Wednesday, January 15, 2020

The gross misunderstanding in educational accountability


For a word used with ease in educational policy circles, accountability is a term that is surprisingly misunderstood and misused.

Seeing this is relatively simple. Ask an audience to brainstorm a list of terms they associate with accountability and a pattern will quickly emerge. Many of the words will be positive, such as:
  • Transparency
  • Effectiveness
  • Responsibility
  • Outcomes
And many of the words and phrases will be negative, such as:
  • Feet to the fire
  • Testing due to lack of trust
  • Blame
  • Shame
If you list these words in two columns on a sheet of paper, what you will be looking at are the two sides of accountability.

The negative terms represent what happens when an organization refuses to be accountable and/or is perceived as failing. In that case, accountability is something imposed on that organization by outside stakeholders for the purpose of bringing the organization in line. Such an accountability focuses the organization on failure prevention at the expense of everything else.

The positive terms represent what happens in effective organizations. These are organizations that internalize the principles behind these terms and attempt to exemplify them in their efforts.

This type of accountability focuses the organization on how best to sustain itself long-term, and how best to communicate its effort to its stakeholders.

Both types of accountability are perfectly valid depending on the circumstance.

What should be clear is that the objective for any organization should be an accountability focused on long-term sustainable excellence. This properly aligns the organization with its long-term goals and the idea of continuous improvement.

What should also be clear is that when stakeholders impose an accountability of failure prevention, it must be done thoughtfully. Its intent is not long-term sustainable excellence, but just the opposite: an immediate, short-term failure correction. The intent of an imposed accountability is to focus the organization and its resources on correcting the failure at the earliest possible moment, because otherwise the organization’s existence may well be at risk.

An imposed accountability’s purpose is thus temporary: to force an immediate correction after which the organization can turn its focus towards long-term sustainable excellence. When an organization is having its feet held to the fire its job is not long-term sustainable excellence but something else. The sooner it can correct its errors and turn its attention towards long-term sustainable excellence, the sooner it can return to a state of effectiveness.

It would be deeply illogical, and harmful, for any organization to be required to operate in the perpetual shadow of an imposed accountability when the goal is long-term effectiveness. The reason is simple: failure prevention would become the formal focus of the organization, and attempts at long-term effectiveness would be perceived as secondary.

Even if the organization’s leaders recognized they were in an illogical system and attempted to focus stakeholders on their long-term approach, the fact that the imposed accountability exists at the behest of stakeholders, while the long-term approach does not, means the imposed accountability is likely to triumph. At best the positive message would be diluted; at worst it would be ignored or not believed.

Getting the balance right is always a challenge, as organizations consist of many moving parts, and it will regularly be the case that some of those parts deserve an imposed accountability. So long as such accountabilities are temporary, that part of the organization can correct itself and return to a focus on the long term. In that case the overall accountability system will be seen as contributing to the overall well-being of the organization.

The objective must be for any organization to spend the majority of its existence in an accountability focused on long-term sustainable excellence, and as little time as possible under the pressure of an imposed accountability. Only then will it be in a position to deliver effectively for its stakeholders.

Sunday, January 5, 2020

How standardized tests do what they do (which isn’t what most people think)

Standardized test is the name most people assign to the tests used in state accountability systems, commercially available norm-referenced tests, and college admittance tests such as the ACT and SAT. I have long encouraged folks to drop the term “standardized,” since that merely refers to the conditions under which tests are administered, rather than what this narrow family of tests is and does.

Instead, I prefer to call them predictive tests. This describes what they are intended to do.

I have also strongly encouraged a more critical use of vocabulary regarding predictive testing. This is because of the massive confusion that results from the plethora of terms now applied to testing that don’t mean what most people think, such as standards-based, or criterion-referenced.

What sets a predictive test apart from all other forms of testing is its ability to produce predictive scores. Simply (and crudely) put, if I am slightly above average this year, you can predict that I will probably be slightly above average next year. If I am not, if I am well above or below average, you can note it and begin the search for causes. Perhaps there are lessons to be learned or perhaps not, but as a signal for where to look such test scores have some use.

Confusion is created when people presume that their names for testing, such as standards-based or criterion-referenced, are parallel forms of testing to a predictive test. This is inaccurate. If the tests produce consistent results across administrations, they are first and foremost predictive tests. You may have drawn the content from a state’s written standards and labeled it a standards-based test, or drawn a line in the sand and assigned it a label, in which case you created a criterion (as you have assigned a score meaning that is external to the test). Or you may have conducted a comparative study after the fact that allowed you to apply norms. Regardless, the style of tests in which you are operating is predictive.

And, by the way, creating this narrow sort of instrument requires real specialization and training, as the sorting function will only occur in a consistent fashion with test items that perform within a narrow set of statistical criteria, and that combine to create a specific effect. This is a far cry from a teacher building a test to understand the effectiveness of their teaching or whether students learned a lesson—that isn’t even in the same ballpark. The last thing a teacher should care about regarding learning is whether their items sort kids into a curve, while that concern is first and foremost in order for a predictive test to work.

The greatest mistake people make with a predictive test is to presume that the consistency in the results has more meaning than it does, when the fact is that the meaning is surprisingly limited.

The consistency is created by first finding the average and then calculating how far from that average each test taker is. Since averages are reasonably consistent over time, as is a student’s relationship to the average, the results will be as well.
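
A minimal sketch can make this concrete. The scores below are made up purely for illustration, and real instruments use far more elaborate scaling, but the underlying idea of scoring as distance from average is the same:

```python
# A toy illustration of scoring as "distance from average" (a z-score).
# The scores are invented; real tests use far more elaborate psychometrics.
import statistics

this_year = {"Ana": 512, "Ben": 478, "Cam": 530, "Dee": 495, "Eli": 485}

mean = statistics.mean(this_year.values())
stdev = statistics.stdev(this_year.values())

# Each student's score re-expressed as distance from the average.
z_scores = {name: (score - mean) / stdev for name, score in this_year.items()}

for name, z in sorted(z_scores.items(), key=lambda kv: -kv[1]):
    position = "above average" if z > 0 else "below average"
    # Because averages, and a student's relationship to the average, are
    # reasonably stable over time, this position doubles as a rough
    # prediction of next year's position.
    print(f"{name}: z = {z:+.2f} ({position})")
```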

The usefulness in this is that a student’s position is predictive as described above, and movement can be explored for potential lessons. The resulting orderings are also useful in that they reveal broad patterns, often regarding socioeconomics, gender, race, etc. As researchers identify these patterns and policies and procedures are put in place, future parallel instruments can be used to gauge the effectiveness of those policies and procedures by noting whether or not negative patterns dissipate.

A perfect ordering on an entire domain is simply not possible—that would result in a test that was thousands of items long. Instead, test makers locate a few items that will order students about the same as if the ordering were done on the entire domain. This makes the test a proxy for the domain, and still useful in spite of the fact that it is not a statistically representative sample of it. So long as the ordering on the limited selection of content will be roughly the same as on the entire body of content it is still useful in the hands of a thoughtful researcher who understands how the tested content was derived.
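To see why a small selection of items can stand in for a much larger domain, consider the toy simulation below. Everything in it is invented, including the deliberately crude correct/incorrect model; it is only meant to show that the ordering produced by a subset of items tracks the ordering the full item pool would produce:

```python
# A toy simulation: order students on a small subset of items and check how
# closely it matches the ordering on the full item pool. All data are
# simulated; this is not a real psychometric model.
import random

random.seed(0)
N_STUDENTS, N_ITEMS, N_SUBSET = 200, 1000, 40

abilities = [random.gauss(0, 1) for _ in range(N_STUDENTS)]
difficulties = [random.gauss(0, 1) for _ in range(N_ITEMS)]

def total_score(ability, items):
    # A student answers an item correctly when ability plus noise clears
    # the item's difficulty.
    return sum(ability + random.gauss(0, 1) > d for d in items)

full = [total_score(a, difficulties) for a in abilities]            # whole domain
proxy = [total_score(a, difficulties[:N_SUBSET]) for a in abilities]  # a few items

def ranks(xs):
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

# Spearman-style rank correlation between the proxy and full-domain orderings.
rf, rp = ranks(full), ranks(proxy)
n = len(rf)
mean_r = (n - 1) / 2
cov = sum((a - mean_r) * (b - mean_r) for a, b in zip(rf, rp)) / n
var = sum((a - mean_r) ** 2 for a in rf) / n
print(f"rank correlation (proxy vs. full domain): {cov / var:.2f}")
```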

The fact that such tests are proxies for the larger domain adds another limitation to the scores: they are estimates only, with some amount of imprecision in each. That just means that while a majority of the time students taking similar tests on consecutive days will score similarly, some will not, and some will have scores that differ a great deal. Again, in the hands of a researcher who understands these limitations and that the scores are simply a broad signal for where to look for patterns and causes, these limitations don’t render the results useless. While they are limited, they can be useful so long as that use can tolerate the fact that scores are estimates based on a proxy and nothing more.
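The estimate-plus-imprecision point can also be illustrated with a toy simulation. The score scale and the size of the error term below are arbitrary assumptions, not any real test’s standard error of measurement:

```python
# A toy look at score imprecision: simulate the "same" students taking two
# parallel forms on consecutive days. The noise term is an arbitrary
# stand-in for measurement error.
import random

random.seed(1)
true_abilities = [random.gauss(500, 50) for _ in range(1000)]

def observed(true_score):
    return true_score + random.gauss(0, 20)  # assumed error spread

day1 = [observed(t) for t in true_abilities]
day2 = [observed(t) for t in true_abilities]

diffs = [abs(a - b) for a, b in zip(day1, day2)]
close = sum(d <= 20 for d in diffs) / len(diffs)
far = sum(d > 60 for d in diffs) / len(diffs)
# Most students score similarly across days; a few differ a great deal.
print(f"within 20 points across days: {close:.0%}")
print(f"more than 60 points apart:   {far:.0%}")
```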

The primary confusion comes because the predictive test methodology produces reasonably consistent scores over time even though the test is based on a proxy for the entire domain. The resulting estimates (scores) are still sufficiently consistent over time to allow for researchers to find some value in them. But that doesn’t magically transform them into something they are not, opening up a world of uses beyond their design. Any use that assumed so would be silly.

Which is why the use of state test scores can rightfully be called silly. They are derived from the predictive test methodology yet are treated not as proxies, but as representative of an entire domain, worthy of teaching to and of guiding learning, when that cannot be the case. They are treated not as estimates useful for research, but as absolutes used to make judgments. And worst of all, they are treated as signals of quality when that was never in their design.

This last point has been particularly disastrous for schools that serve students from historically marginalized communities. It is a fact that if you order students as of a day on a domain such as literacy—whether via proxy or a more complete measure—and some aspect of society contributes heavily to students’ ability to acquire knowledge within that domain, the ordering will reflect that. But as of that moment no judgment is available to be made. Some set of students may be behind because of real failure in their efforts or those of the school, in which case remedies for failure should be available and applied. But they may just as well be behind due to a lack of opportunity. In that case a failure judgment and remedy would be wrong, even unethical, as it would be the wrong remedy.

Rather, a different remedy should be applied that addresses the issue of being behind as being behind, but not failure. Mislabeling the problem would be a huge mistake as it would create perceptions that may not be real, force actions that run counter to need, and justify historical biases. Even worse, labeling being behind as failure risks converting being behind to failure, in which case the current system of test-based accountability could be said to have been a contributing cause to the further suffering of those who can least afford it, to the detriment of our nation as a whole.

In short, every role educational policy asks predictive tests to play is outside and beyond their design, with a host of ill effects flowing from those bad assumptions. Predictive tests cannot be used to judge quality or effectiveness, guide or drive instruction, or indicate the effectiveness of policy.

So, there you have it: predictive tests work by being predictive, but in order to be predictive they can’t be much else, and they certainly cannot be used as the primary tool in school accountability. The sooner we all realize that fact the better.

Monday, November 25, 2019

Response to a common set of questions on how best to use tests in an accountability system

I received a note the other day with an inquiry. It contained four questions. I took the opportunity to craft a response I’ll share below, since I get these sorts of questions a lot.

Here were the questions:

1. How can a standards based adaptive assessment used throughout the year be one tool used for accountability purposes?

2. If an assessment covering a set range of standards is used throughout the school year, what other factors need to be considered to more effectively determine if students are reaching developmentally appropriate learning targets?

3. Content mastery and student progress on state standards measure student proficiency towards specific items. How should student work samples, portfolios, or other student level artifacts be used as an indication of a school’s ability to develop independent young adults?

4. In terms of accountability, what value is there in communities creating annual measurable goals aligned to a 5 year strategic plan and progress towards those goals being the basis of accountability?

These questions are similar to those I get almost every day from people understandably trying to fit square pegs into round holes. There are multiple layers to a response.

First, accountability over the years has become synonymous with test scores and objective data. When trying to gather information about learning, proficiency, or progress, test scores are presumed to be the best, and often the only, source for answers. Even when other sources are considered, test scores tend to occupy the primary position in the conversation.

Second, coverage is now the dominant paradigm in learning. Covering a state’s content standards is a common goal, and most other educational targets such as development, mastery, and progress are presumed to relate to the amount of content consumed. This is due almost entirely to the fact that the tests are said to cover a broad swath of content, and given that success is defined by those tests, success and coverage are presumed to be one and the same.

“Success” in such a system is in fact anything but, due entirely to the design of that system. Consider that tests that produce predictive results over time yield far less interpretive information than state accountability systems presume. The assumption on the part of the state is that a predictive test score is capable on its own of signaling success or failure, both of the student and the school. But that assumption belies the design. Predictive tests produce scores that indicate where a thoughtful educator or researcher may want to explore further, but they cannot contain within them the causes behind the indicator—in fact, the ability to make direct causal connections is removed during the design process in order to create the stability in the results over time.

Once a cause is understood it may be worthy of judgment, but until it is explored any judgments (whether good or bad) are premature, made without evidence. Judgments made prior to an exploration of causes will make an organization less, not more effective, because absent an understanding of cause any change is a shot in the dark at best. If an effect does occur, the shot-in-the-dark actions will be presumed to have caused it, and those actions will be repeated or discarded without any understanding of whether they actually contributed to the effect.

Any accountability that fails to allow for the identification of causes prior to judgments will do this. I know of no other field with an accountability that commits such an egregious mistake, as it is a recipe for confusion and inefficiency.

And please know, what I describe above is baked into the current design of educational accountability, which is why the questions you pose are so common. Underlying each question is a deep desire for effective teaching, deep learning, and preparing children for their lives, as well as the need to build long-term sustainable solutions. But that isn’t what the current system was designed to do, which is where the misfit comes from.

The best way to see this is to recognize that there are two sides to accountability. The first is easily understood if an audience is asked to list all of the terms they associate with accountability. Most will offer up things such as responsibility, transparency, effectiveness, outcomes, and success against mission. These are all positive and any effective leader includes all of them in their leadership practice.

But there is another side to accountability, the kind applied to organizations that refuse to be accountable. In this case accountability is imposed upon the organization by outside stakeholders. When it is necessary to impose an accountability, the positive terms are presumed to be absent and it becomes necessary to hold people to account, to motivate through blame and shame, to test claims due to mistrust, and to inflict punishment or sanctions when compliance does not occur.

The objectives of these two accountabilities are different. In the first, the goal is a long-term sustainable effect. In the second, the goal is failure avoidance. If the goal is failure avoidance there isn’t time to think about long-term sustainable effects, as you aren’t yet there. First you need to prove you can avoid failure; then you can think about doing great things.

This is why in every case other than education, imposed accountabilities are temporary, meant to resolve a crisis in the short term so the organization can get back to long-term sustainable thinking. It would be folly to think that an imposed accountability can focus on long-term excellence as that is not in its purpose nor in its design.

It is this difference that defines the tension in the questions you propose. Those questions each contain the desire for a long-term effectiveness, and yet they are being asked from within an imposed accountability environment designed to promote failure avoidance (the coverage paradigm of our current standards environment is a perfect example, as it is about control in support of failure avoidance, not long-term excellence). Our policies use language that aligns with long-term effectiveness while imposing a system designed as a short-term response to failure.

All of which is exacerbated by the selection of a predictive testing methodology that policy makers assume can do things for which it was never designed, most notably signal on its own the success of a school or the quality of a student’s performance without actually knowing the cause.

With that as a context let me now start to address the questions you pose a bit more directly.

Any test score, be it a predictive test score with its underlying psychometrics or a classroom quiz, is a form of evidence. But in order to serve as evidence for a thing you must first have a sense of what that thing is. Evidence is necessary to answer critical questions such as: Who is learning? What are they learning? Who is not learning? What is preventing learning from happening?

None of these on their own are answerable through a single evidentiary source, and each question requires sources other than test data to create a sufficient understanding regarding what to do next. Any action that attempts to treat any data source as absolute risks a decision based on incomplete evidence, which makes the decision invalid, even if by luck it happens to be the right decision. In any case it makes the organization less, not more effective, by creating dissonance between the effects that can be observed and their causes. This in turn risks promoting the wrong causes for observed effects, which is never a good thing.

Finally, accountability in effective organizations occurs at a level that both the technical experts within a profession and the amateurs outside it can relate to and understand. Think about a visit to the doctor and you'll understand what I mean. Those of us who are not doctors can stare at a battery of test results for hours and still not understand what they mean. We may go on WebMD and attempt to view each indicator in isolation, but a meaningful interpretation requires a doctor with a much broader and deeper understanding than those of us who have not been through medical school possess.

The doctor does not start by taking us one by one through each of the dozens of tests, but rather, at a level we can both relate to as an amateur and a professional: the relative health of the patient. From there, the doctor can take a patient into the weeds for a deeper conversation where technical understanding is required, but through a lens appropriate to those of us without medical training.

The same is true for any profession that requires technical understanding: engineering, mechanics, computer programming, education, etc. In each of these there exists a level at which professionals and amateurs can have meaningful conversations about the work, and it is at that level that organizational accountability must occur.

It would be difficult, if not impossible, for outsiders to engage in a meaningful way with the technical part of an organization. The nature of technical information is such that the further into it you go the more likely you are to identify contradictions, counterintuitive thinking, and a lack of absolutes, which requires a technical understanding to work through and still be effective. Someone without that technical understanding is at risk of seeing the contradictions, counterintuitive thinking, and lack of absolutes as negative, as evidence of something other than what they had hoped to see.

It would be naïve to think that the non-technical person could dictate a meaningful response based on their limited understanding, which is why it isn't done—it would make the organization less, not more effective. I don’t argue with an engineer over how far his or her beams can be cantilevered over an open space, but rather start at a point we both understand—what I want the building to look like—and let the professional do their job.

Test scores represent technical information, especially predictive test scores with their psychometric underpinnings. As such they require technicians to interpret them properly given that those interpretations will often run counter to what an untrained eye might see. For example, an untrained eye may equate a low test score with failure and insist a school act accordingly. But a technician who understands such scores would first look to causes and other evidence before arriving at any conclusion.

It may be that the evidence suggests some amount of genuine failure exists, in which case the remedies for overcoming failure should be applied. But it may also be the case that the evidence suggests the student is simply behind his or her peers because their exposure to academic content outside of school is limited. In that case the remedies for being behind should be applied, which are very different from the remedies for failure. To apply the wrong remedy would make the school and the student less, not more effective.

Starting with test scores as the basis for any accountability absent a technical interpretative lens creates this very risk. Test scores, contrary to popular opinion, are not simple to understand, do not produce immediately actionable results, and should not be interpreted bluntly. They are always in the weeds of an organization, part of the technical environment in which professionals work. While we should never be afraid of sharing them broadly, it is imperative that we take our outside stakeholders into them through an interpretive lens appropriate to both the amateur and the professional. The failure to do this will result in misunderstandings and frustration on all parts.

The answer to all four questions that started this off is this: educational decisions require a rich evidentiary environment that goes well beyond traditional data sources to understand the educational progress of a child. Tests can certainly be a part of that evidentiary environment, and better tests and assessments are useful in that regard and we should encourage their production. But better tests or better assessment vehicles do not solve the accountability problem.

That problem is only solved once we can shift from an imposed accountability focused on failure avoidance to a true accountability focused on long-term sustained excellence. Continuing to treat testing as our primary accountability source mires us in the technical weeds and as a result is highly likely to create misunderstandings regarding school effectiveness.

My advice: ask the right questions; treat test scores as one evidentiary source but never the only evidentiary source; question the interpretations alongside other professionals so that the best possible conclusions can be reached; and define success in any long-term plan by answering the question “What is it we hope to accomplish?” rather than “What should we measure?” That latter question will tie you up in knots, as what is empirically measurable represents only a small percentage of what matters in a child's life and to a school.

Evidence is the proper term, as we can gather evidence on anything we need to accomplish so long as we can observe it. Focus at that level and you’ll arrive at meaningful answers to each of your questions.

Best,
John

Monday, September 23, 2019

The problem with calling charter schools "public" schools

I recently posted something to Twitter that generated quite a reaction:

"Support for charter schools by policy makers is an admission they don’t want to do the hard work to make public schools better. And their obsession with test scores that don’t mean what they think drives their narrative. True accountability solves this. Don’t you think it’s time?"

Most supported the thinking, which is simple logic: you don't charter fire houses or police stations when things go awry--you get experts in to solve whatever problems exist. When it comes to schools, policy makers went another route.

But several folks, predictably, did not agree, declaring, with noted exasperation, that charter schools are public schools and for me to say otherwise puts me into the camp of not wanting to have to improve traditional schools to the point that they can compete.

I spend my life shredding such stupid arguments, but the point I want to make here is different. Rather than argue what the label of "public school" should apply to, I instead want to perform a simple compare/contrast exercise to show that whatever you want to call them, they are not the same thing. And rather than write a book (which I could), I'll limit myself to three things.

First, traditional schools have an elected board that represents the will of a community for its schools. This elected board hires the superintendent, makes budgetary decisions, and must ensure that the district operates within all of the rules and regulations placed on them by the state. Funding for facilities is through bonds, which the electorate must approve.

Charter schools have an appointed board that sees to an overall mission of the school, or in the case of charter chains, lots of schools (and often lots of profits). The notion of a community as a physical place does not exist, and budgetary decisions are far less regulated than in a traditional school, and in some cases not regulated at all. Funding is through formulas unique to a state's charter rules, and the idea of going for a facilities bond would be silly since no community exists to vote on such a thing. As a result, facilities are included in the formula.

Second, the school tax for a community is determined by that community, so that whether perceived as fair or unfair traditional schools are funded through taxation with representation. But charter schools are funded based on the number of students enrolled from across taxing jurisdictions. In some states, the funding is actually removed from a traditional public school and given to the charter, by order of the state, thereby usurping the local taxing authority.

No matter how you slice it, charters represent a form of taxation without representation, something that should concern all of us, especially when that unrepresented tax goes to for-profit companies. And whereas the finances of traditional schools are a matter of public record, that is not the norm for charters, which can operate almost entirely in the dark.

Third, charters get to select their students, but a traditional public school takes everyone. Even when charters go to the extreme to attempt fairness in their selection process, people have to select their way in, which all but guarantees that the students most desperately in need of a solid education, or with the greatest number of barriers to obtaining an education, are left to the traditional schools. That means that the students who are least expensive to educate are likely to end up in charters, and those most expensive to educate are likely to end up in traditional schools (I'm referring only to regular ed students here--charters don't generally take the neediest special education students, which is another issue entirely).

Funding formulas in states don't take these differences into consideration, but rather fund per student. That leaves the charters with an abundance of resources and the traditional schools with a dearth. And then charter advocates have the gall to suggest that the competition is fair: that traditional schools serving a more challenging set of students with less than sufficient resources should be able to compete against over-resourced charter schools with their less challenging student populations.

The remarkable thing is how poorly the vast majority of charters do when compared to the traditional public schools when apples to apples comparisons are performed by thoughtful researchers.

Whatever you call them, and whatever your feelings towards either, charters and public schools are not the same thing--to think otherwise is simple ignorance against the facts.

Friday, March 8, 2019

The structure of accountability in effective organizations

The basic accountability structure in effective organizations is surprisingly simple. It consists of two parts: the first is a thorough accounting, and the second is an appropriate signal (thorough and appropriate are key terms here). An effective organization is defined as one that regularly achieves its mission.

The accountings that go into an accountability system are determined by what needs to be accomplished in order to achieve the organization’s mission. A hospital would consider patient outcomes. A business would consider its ability to be innovative or profitable.

Regardless, the accounting must be thorough. No one would invest in a company that released one month’s worth of financial records from one of its ten divisions. The decision to invest would be invalid. No one would have surgery in a hospital that released only its patient outcomes for a type of surgery other than the one they will have. In either case the information needed to make an effective decision is missing.

Now consider what would happen if each of these organizations were required to change based on this information. If the company forced a change on its other nine divisions, or even within the division that provided the month’s worth of books, those decisions would be invalid at best, and at worst dead wrong. Acting upon them would make the organization less, not more efficient, and risk damaging the organization in a very real way. If the hospital forced a change based on its limited information, it would risk undermining areas of surgery that are highly effective and ignoring areas of surgery in dire need of change. Again, the outcome is that the organization becomes less, not more efficient, and risks real harm to the organization and its mission. In the case of the hospital the harm extends to the patients the hospital is supposed to serve.

Now consider signals. Signals represent the forward-facing decisions that will be made to better align an organization and its outcomes with the mission. Signals consider the accountings and then act appropriately according to the mission being considered. Because organizations are complex things different signals are appropriate for different circumstances. Signals will range from being dictated (minimum or no flexibility) to being professionally determined (maximum flexibility).

Consider the signals made in a criminal case, where the mission of the court is justice; here the signal is largely dictated. The accounting, provided it is thorough, would convince a judge or jury to apply a sentence commensurate with the crime, representing a dictated signal.

Consider the signals to be made in the life of a technology company. Its mission at the start will likely focus on innovation without regard for profitability, but at some point, that mission will most certainly include the ability to return the investor’s money along with a profit. The appropriate signals should be in line with the then current mission, and to be effective will require a maximum amount of flexibility in deciding the best next steps.

Ineffective organizations are those which rarely, if ever, achieve their mission. All too often such organizations violate the accountability structure used in effective organizations to an embarrassing degree. Ineffective organizations frequently rely on partial accountings, demand or make inappropriate signals, or in the worst-case scenario, both. As shown above, partial accountings lead to invalid decisions that make the organization less efficient.

Inappropriate signals directly put an organization’s mission at risk. Consider that the mission of justice would be poorly served if every court case were a crap shoot with all outcomes possible, with no regard for the nature of the crime. Consider that the mission of early innovation in a company would be poorly served if the signals were only about profit, just as the later mission of profit would be poorly served through signals that were only about innovation. In each of these cases the signals are inappropriate and threaten the mission, and if left unchecked, will cause the organization to fail.

The most frustrating of all the scenarios would be one in which the organization was required to use partial accountings and then be forced to make inappropriate signals. In the case of the court example lots of innocent people would go to prison and lots of guilty people would not and the mission of justice would not be served. In the case of the tech company no one would have a clear sense of what was actually going on and the resulting signals would be unlikely to represent decisions capable of accomplishing any of the missions. Bankruptcy would be the most likely outcome.

Which brings us to the structure of current school accountability. Schools obviously have a very clear mission, which is to maximize the educational benefit for each and every child, and an effective school would be one that regularly does that. An effective accountability system would be one that created thorough accountings against that mission, and then given the complexity of educating a child and the unique circumstances of each school, allowed for the maximum amount of flexibility in the signals.

That is precisely what education does not have.

Instead, schools are told to make a partial accounting in the form of several test scores and then make a dictated signal of pass or fail, with additional dictated consequences for failure. No one can reasonably claim that a test score—or even several test scores, even from the finest of tests—is anything other than a partial accounting against the mission. And no one can claim that the signals allow for the flexibility necessary to achieve a school’s mission.

Indeed, the structure of current school accountability is the worst of all possible scenarios: a partial accounting dictates that schools make inappropriate signals. Consider where that leaves schools: the organizations most responsible for the continuation of our civic democracy have been forced into an accountability environment that would cause the best organizations in the world to fail. In other words, an approach that would lead to bankruptcy, injustice, and outright failure in most organizations is now the manner in which we hold schools “accountable.”

Where we should take heart is in the degree to which public schools have managed to survive in spite of an embarrassingly bad accountability environment. Just imagine what those same educators could do if placed in an accountability environment designed for effective organizations. Imagine that their accountings were no longer partial or incomplete, but thoroughly addressed the mission of schooling. Imagine that the signals were appropriate, allowing for the flexibility necessary to make the proper decisions regarding children and their education. Imagine the amount of recoverable effort and energy that could be converted to the mission of schooling simply by eliminating the inefficiency and frustration imposed by a bad accountability system.

We need public schools to be highly effective organizations, which means schools must have an accountability environment designed to support that goal. We can no longer afford an accountability environment designed to help schools fail, and while an argument can be made that creating such an environment was never the intent, that is exactly what was done. For the sake of our children let’s admit as much as a step towards a better place.

Monday, February 18, 2019

A call for action for True Accountability

One of the more interesting (and harmful) things to come out of the test-based accountability era is that we now equate testing with accountability, to the point where most people can’t see how accountability could be done without testing.

This, however, is wrong on so many levels. In my work we approach the issue from a far more practical angle. My question years ago was simple: is there a structure or framework to the way in which effective organizations do accountability? Even though hospitals, businesses, and non-profits function in dramatically different ways, and whether formal or informal, is there something they have in common when their results match their mission?

The answer was a resounding yes: a common set of patterns and frameworks is most definitely shared across effective organizations. Upon that discovery my work immediately shifted to a simple premise: if effective organizations have an accountability framework in common, and we want schools to be among the most effective organizations in society, don’t schools deserve to operate under an accountability framework designed for effective organizations?

If your answer is no, stop reading. You’ll be happy with what we have, which upon closer analysis through the lens of these frameworks reveals itself to be an accountability designed to sow discord, create confusion, and separate the haves from the have nots. That is incredibly ironic since the primary argument for the system was based on equity.

But if your answer is yes, these true accountability frameworks offer a compelling solution.

Think of it simply: which question is more important to a parent:
1. Was my child safe yesterday?
2. Will my child be safe in school today?

It isn’t that the first is unimportant, as it informs our work, but it is the second that should occupy our accountability efforts, and that is of greatest concern to all of us. What that suggests is that accountability must have a forward-facing function, or it will fail to support a continuous improvement mindset. When accountability is only about what happened, the most likely messages will be negative given that the perfect school does not exist, and judgments about the past are always against some ideal.

When accountability takes a forward-facing approach, it puts a set of leaders in the position of leading towards the future, and when done properly, as in effective organizations, it makes those efforts transparent to anyone wishing to look.

Test-based accountability can do nothing of the sort. It offers a brief backwards-facing look that is at best a partial accounting, and it fails to offer insight into the more important of the two questions: will my child be safe today?

I had the theory for a lot of this pretty well intact two and a half years ago when the Texas Association of School Administrators and I decided to partner and see if districts were interested in doing this work. We hoped for a dozen and then more than forty signed up. That group took the theories and made them live, fine tuning the old frameworks and building new ones in the very pragmatic environment of actual schooling. We now have research partners, and more and more districts interested in joining. Additional states have started to take notice, and a great many organizations are coming to a similar conclusion: that tweaking test-based accountability is a waste of time and risks the future of the majority of our children.

I have never put out a blatant call to action, until now. I am encouraging you to find a way to support this work, wherever you are. Learn the frameworks and put them into action. Fine tune them through practice and share your discoveries so that others can benefit. Stop the nonsense of thinking that a better test exists or that tweaking the existing system solves anything. It does not. We need a different way of doing school accountability, one that finally is good for our children, their communities, and their schools.