GAO recently developed a framework for the use of artificial intelligence (AI) by federal agencies. At the federal level, AI has applications across a variety of sectors, including transportation, healthcare, education, finance, defense, and cybersecurity.
The framework consists of four complementary principles: governance, data, performance, and monitoring. Its purpose is to ensure accountability and the responsible use of AI in government programs and processes.
“AI is evolving at a pace at which we cannot afford to be reactive to its complexities, risks, and societal consequences,” the report said. “It is necessary to lay down a framework for independent verification of AI systems even as the technology continues to advance.”
The report states that when implementing AI systems, entities need to assess technical performance to ensure that the system solves the problem initially identified and uses data sets appropriate to that problem. Without this assurance, unintended consequences may occur.
According to GAO, one example of an unintended consequence is the use of predictive policing software to identify likely targets for police intervention. The intended benefits of such software are to prevent crime in specific areas and improve resource allocation of law enforcement.
GAO’s report detailed a study in which researchers demonstrated that the tool disproportionately identified low-income or minority communities as targets for police intervention regardless of the true crime rates. “Applying a predictive policing algorithm to a police database, the researchers found that the algorithm behaves as intended,” the report said. “However, if the machine learning algorithm was trained on crime data that are not representative of all crimes that occur, it learns and reproduces patterns of systemic biases.” According to the study, these systemic biases can be perpetuated and amplified as police departments use biased predictions to make tactical policing decisions.
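The feedback loop the study describes can be sketched in a few lines. This is an illustrative toy model, not the algorithm from the study; the districts, rates, and patrol counts are invented to show how records that reflect patrol presence, rather than true prevalence, drive the prediction.

```python
# Toy model: two districts with identical true crime rates, but one has
# historically received far more patrols (all numbers are hypothetical).
true_rate = {"A": 0.10, "B": 0.10}   # equal true crime rates
patrols = {"A": 80, "B": 20}         # historically uneven patrol coverage

# Recorded incidents scale with patrol presence, not true prevalence.
recorded = {d: patrols[d] * true_rate[d] for d in true_rate}

# A predictor trained on these records ranks district A as the "hotspot".
hotspot = max(recorded, key=recorded.get)
print(hotspot)  # prints: A

# If the department then sends additional patrols to the predicted hotspot,
# district A generates even more records next period, entrenching the bias.
```

The point is that the model behaves exactly as designed; the distortion enters entirely through what the training data measured.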
It is also worth remembering that tests of facial recognition technology by the National Institute of Standards and Technology found that it generally performs better on lighter-skinned men than on darker-skinned women, and does not perform as well on children and elderly adults as it does on younger adults. These differences could result in more frequent misidentification of individuals in certain demographic groups.
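Disparities like these only surface when accuracy is measured per demographic group rather than in aggregate. The sketch below, using entirely made-up match results, shows how a respectable overall number can hide a large gap between groups.

```python
# Sketch: disaggregating match accuracy by group (all data is invented).
from collections import defaultdict

results = [  # (group, was the match correct?)
    ("lighter_skinned_men", True), ("lighter_skinned_men", True),
    ("lighter_skinned_men", True), ("lighter_skinned_men", False),
    ("darker_skinned_women", True), ("darker_skinned_women", False),
    ("darker_skinned_women", False), ("darker_skinned_women", False),
]

correct = defaultdict(int)
total = defaultdict(int)
for group, ok in results:
    total[group] += 1
    correct[group] += ok  # True counts as 1

for group in total:
    print(group, correct[group] / total[group])
# The aggregate accuracy here is 50%, which masks a 75%-vs-25% gap.
```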
A further example was an AI predictive model used in healthcare management: researchers comparing Black and White patients’ health risk scores found that, because healthcare expenses were used as a proxy for health needs, Black patients received different risk scores despite comparable needs.
“However, healthcare expenses do not represent health care needs across racial groups, because Black patients tend to spend less on healthcare than White patients for the same level of needs, according to a study,” GAO said. “For this reason, the model assigned a lower risk score to Black patients, resulting in that group under-identified as potentially benefiting from additional help, despite having similar healthcare needs.”
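The proxy problem GAO describes can be made concrete with a minimal sketch. The patients, dollar figures, and scoring rule below are hypothetical; the sketch only shows how a model that scores by spending, never seeing need directly, ranks two equally needy patients differently.

```python
# Sketch of a proxy-variable failure (all figures are illustrative).
patients = [
    {"group": "White", "need": 7, "annual_spend": 7000},
    {"group": "Black", "need": 7, "annual_spend": 4000},  # same need, lower spend
]

def risk_score(patient):
    # The model never sees "need"; spending stands in as the proxy.
    return patient["annual_spend"] / 1000

for p in patients:
    print(p["group"], risk_score(p))
# Equal need, but the spending proxy yields 7.0 vs 4.0 -- so the
# lower-spending patient can fall below an enrollment cutoff
# that the other patient clears.
```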
Through governance practices, management will be able to manage risk, highlight the importance of integrity and ethical values, and ensure compliance with laws and regulations, the report said. At the organizational level, governance will help “incorporate organizational values, consider risks, assign clear roles and responsibilities, and involve multidisciplinary stakeholders,” GAO said.
At the system level, governance will help entities ensure that AI meets performance requirements and achieves its intended outcomes, the report said. According to GAO, three key practices are defining technical specifications to ensure the system meets its intended purpose, ensuring the system complies with applicable laws and regulations, and promoting transparency with external stakeholders about the system’s design, operation, and limitations.
Through data practices, entities will be able to “ensure quality, reliability, and representativeness of data sources, origins, and processing,” the report said. For data used in model development, key practices include documenting the sources and origins of data, assessing data reliability, assessing data variables, and assessing the use of synthetic, imputed, or augmented data.
For data used in system operations, three key practices were established: assessing interconnectivities and dependencies of data streams, assessing data quality and potential biases, and assessing data security for AI systems.
The report also emphasized the importance of data security and privacy: GAO said entities that use or plan to implement AI systems should conduct data security assessments (including risk assessments), maintain a data security plan, and conduct privacy assessments. Any deficiencies or risks identified in security and privacy testing should also be addressed.
Addressing performance, GAO established practices intended to produce results consistent with program objectives. At the component level, these include documenting model and non-model components, defining consistent performance metrics, assessing the performance of each component, and assessing each component’s outputs.
At the system level, the same component-level practices apply, with the addition of identifying potential biases or other societal concerns arising from the AI system and developing procedures for human supervision of AI to ensure accountability, the report said.
The last principle, monitoring, is intended to ensure reliability and relevance over time, GAO said. Practices for continuous monitoring of performance consist of developing plans for continuous or routine monitoring of the AI system, establishing the range of data and model drift that still produces the desired results, and documenting the results of monitoring activities.
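A minimal version of that monitoring practice can be sketched as a drift check: compare a live feature stream against a baseline recorded at training time and alert when it leaves a tolerance band agreed in advance. The baseline, tolerance, and data below are all illustrative assumptions, not values from the report.

```python
# Sketch of routine drift monitoring (thresholds and data are illustrative).
from statistics import mean

BASELINE_MEAN = 50.0   # feature mean observed when the model was trained
TOLERANCE = 5.0        # acceptable drift range, agreed up front

def check_drift(window):
    """Return (drift, alert) for a window of recent feature values."""
    drift = abs(mean(window) - BASELINE_MEAN)
    return drift, drift > TOLERANCE

print(check_drift([49, 51, 50, 48, 52]))  # small drift, no alert
print(check_drift([60, 62, 58, 61, 59]))  # mean shifted to ~60 -> alert
```

In practice the "range of drift that still produces the desired results" would be set per feature and per metric, and each check would be logged to satisfy the documentation practice the report pairs with monitoring.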
In addition, when assessing sustainment and expanded use, GAO said the established practices would assess the utility of the AI system to ensure its relevance and identify conditions under which the AI system may be scaled or expanded beyond its current use.
GAO also highlighted the issue of the talent deficit in the federal workforce, quoting a report from the National Security Commission on Artificial Intelligence, which said: “The human talent deficit is the government’s most conspicuous AI deficit and the single greatest inhibitor to buying, building, and fielding AI-enabled technologies for national security purposes. This is not a time to add a few new positions in national security departments and agencies for Silicon Valley technologists and call it a day. We need to build entirely new talent pipelines from scratch.”