Basic machine learning terminology.
A dataset is a set of pairs—each one made up of a sample and a label.
A model accepts a sample and returns a prediction of what the sample's corresponding label might be.
A model learns how to make good predictions with a training dataset using the following process:
- The model receives samples from the training set and returns predicted labels.
- The training dataset contains the actual labels for each sample, so the model compares the predicted labels versus the actual labels.
- The model improves itself in the "direction" specified by the difference between the predicted and actual labels.
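The three steps above can be sketched as a tiny training loop. This is a minimal illustration, not a description of any model we actually train: the one-weight linear model, the squared-error gradient, the toy dataset, and the names `predict`, `train`, and `learning_rate` are all assumptions made for the example.

```python
# Toy training set: (sample, label) pairs that follow label = 2 * sample.
training_set = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


def predict(weight: float, sample: float) -> float:
    """A hypothetical one-parameter model: prediction = weight * sample."""
    return weight * sample


def train(dataset, learning_rate: float = 0.01, epochs: int = 200) -> float:
    weight = 0.0  # arbitrary starting point
    for _ in range(epochs):
        for sample, label in dataset:
            # Step 1: the model receives a sample and returns a predicted label.
            predicted = predict(weight, sample)
            # Step 2: compare the predicted label with the actual label.
            error = predicted - label
            # Step 3: move the weight against the gradient of the squared
            # error 0.5 * error**2, which is error * sample.
            weight -= learning_rate * error * sample
    return weight


learned_weight = train(training_set)
print(f"learned weight: {learned_weight:.4f}")
```

On this toy data the learned weight settles very close to 2, the value used to generate the labels.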
A model learns to make decisions using the training set. A separate dataset called the validation set is used to compare models as they improve. A test set is another dataset used to compare different models at the end of the training process.
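As a rough sketch of how one dataset might be divided into these three sets, the example below shuffles the pairs and slices them. The 80/10/10 proportions, the fixed seed, and the function name `split_dataset` are assumptions chosen for illustration, not a prescribed procedure.

```python
import random


def split_dataset(pairs, validation_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle (sample, label) pairs and split them into three disjoint sets."""
    shuffled = list(pairs)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_validation = int(len(shuffled) * validation_fraction)
    test_set = shuffled[:n_test]
    validation_set = shuffled[n_test:n_test + n_validation]
    training_set = shuffled[n_test + n_validation:]
    return training_set, validation_set, test_set


# Hypothetical dataset of 100 (sample, label) pairs.
pairs = [(i, i % 2) for i in range(100)]
training_set, validation_set, test_set = split_dataset(pairs)
print(len(training_set), len(validation_set), len(test_set))
```

Keeping the three sets disjoint matters: a model evaluated on samples it was trained on would look better than it really is.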
If we have successfully trained a model, it is ready to be used for inference—the use of the model to generate predictions from data samples whose true labels are unknown to us.
How we define "harmful" biases.
The word bias can mean many things. In this document we use "bias" to describe unfair (and often illegal) judgements against groups of humans.
We use the phrase protected attribute to refer to human characteristics that are protected by laws or ethical sensibilities. The actual laws defining protected attributes can vary across jurisdictions. For the purpose of this AI Ethics Code, the following is an (incomplete!) list of protected attributes:
- political beliefs
- biological sex
- gender, gender identity, and gender expression
- genetic information, medical history, and disability status
- family planning, pregnancy status, and family size
- nationality, national origin, and citizenship status
- race, color, and ethnicity
We use the phrase harmful bias to refer to any mechanism that could lead to judging a person—whether real or imaginary—based on any protected attribute. Harmful biases can originate from many places, including:
- intentionally-biased human decisions. Many organizations, such as banks, schools, employers, and governments, have made biased decisions as institutional policy, such as redlining. If you wanted to train a model on historical mortgage data or real estate sales, you would have to find a way to control for lenders' history of institutionalized racism.
- unconsciously-biased human decisions. Nowadays, most people will not admit to consciously holding harmful biases, but academic research like Harvard's Implicit Association Test (IAT) claims to show that nearly everybody acts on unconscious biases. Because of this, it may be impossible to guarantee that any individual human can make unbiased decisions.
- hardware, software, or algorithm design. For example, for much of its history, color photography (both film and digital) was optimized specifically for photographing subjects with white skin. Some of that bias is within the chemistry of the film and the hardware of the camera.
- within the weights of a statistical model. A model may have tens of billions of parameters, which cannot easily be directly studied or analyzed. Sometimes the statistical properties of a model's inputs and outputs can be used to suggest bias exists, but it is an unsolved problem to prove that a sophisticated model has no harmful biases at all.
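To illustrate how the statistical properties of a model's outputs can suggest bias, the sketch below compares how often a model returns a positive prediction for each group. This is only one of many possible statistics (it is sometimes called a demographic parity gap), and the data, group names, and function names are hypothetical. A large gap suggests that bias may exist; a small gap does not prove that a model is free of harmful biases.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Fraction of positive predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for prediction, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(prediction)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_gap(predictions, groups):
    """Difference between the highest and lowest per-group selection rate."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical audit data: 1 means the model produced a favorable prediction.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap = demographic_parity_gap(predictions, groups)
print(selection_rates(predictions, groups), gap)
```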
Most of this AI Ethics Code is focused on minimizing harmful biases when training machine learning models.
Limitations of this AI Ethics Code
This AI Ethics Code does not apply to non-public experiments.
The principles in this document only apply to machine learning models that are any of the following:
- used as part of externally-facing software products.
- used in internal tools to help employees and contractors make business decisions.
- used to generate data that we release to the public.
This AI Ethics Code does not apply to models trained or used for internal machine learning research. For example, we may want to intentionally train a "research model" with harmful biases, specifically to get a better understanding of how models learn biases. Such research models and their outputs should not be published or integrated into any of our internal or external software products.
This AI Ethics Code is not enforced against our vendors.
This AI Ethics Code also does not apply to machine learning models sold, leased, or licensed to us by vendors. Because their datasets, training methods, and policies are proprietary, we would not be able to enforce our AI Ethics Code against them.
We will not train models on data samples that could reveal a human's protected attributes.
We will not train models directly against datasets where the samples can reveal—for example—a person's skin color. Examples of such datasets include:
- music album art.
- image data extracted from music videos.
- text other than music lyrics or song metadata.
- socioeconomic or biographical data, such as language, locale, time zone, employment history, zip code, or race.
While there may be ways to reduce harmful biases learned by models trained on these datasets, we cannot take the risk.
Exceptions to this principle include:
- Models that predict similarity and/or identify copyright infringement. This principle forbids us from training a model that predicts "quality" or commercial success from album art. However, we still may want to train image-hashing models that can help identify whether one artist has plagiarized their album art from another artist.
- Models trained on raw audio. In many cases, we may want to understand the biases in a raw audio dataset, especially if they directly correlate with what type of music listeners enjoy. While we don't endorse such biases or believe they cause good music to be made, it is important that we understand them.
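As an illustration of the similarity use case for album art, the sketch below uses a simple "average hash". This is a classic non-learned perceptual hash, used here as a stand-in for the image-hashing models mentioned above. It assumes each image has already been converted to a small grid of grayscale values; the tiny 2x2 grids and all function names are hypothetical.

```python
def average_hash(pixels):
    """Hash a small grayscale grid: one bit per pixel, set if above the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Number of differing bits; a small distance suggests similar images."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Hypothetical 2x2 grayscale images (real use would downscale to e.g. 8x8).
original = [[10, 200], [220, 30]]
brightened = [[30, 220], [240, 50]]  # same picture, uniformly brighter
different = [[200, 10], [30, 220]]   # a different picture

print(hamming_distance(average_hash(original), average_hash(brightened)))
print(hamming_distance(average_hash(original), average_hash(different)))
```

The hash compares only the arrangement of light and dark regions, so it says whether two images resemble each other without scoring their "quality" or appeal.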
We will not train models on "private" data samples without permission.
We will not train models on unreleased music without the artists' consent. There are many ways we may end up in possession of unreleased music, such as:
- demo tapes sent to us by artists.
- music created but not yet released by artists already signed to Neocrym.
- music that artists upload to online tools, web applications, or cloud storage services that we operate.
This rule exists so that we do not violate confidentiality with entities that send us private data.
The exceptions to this principle are all centered around detecting bad behavior. We reserve the right to use private data samples for:
- detecting malicious inputs. For example, if you send an email to someone at Neocrym, we may mark it as spam and use it to train a spam classifier. But we wouldn't use the emails you send us to train a model that generates text, unless you consented to such a use case. Similarly, any credit card purchases you make with Neocrym are considered to be private, but data from suspicious or fraudulent transactions may be used to improve fraud detection models.
- investigating potential data leaks. For example, if we have a private server containing unreleased music and private data, we reserve the right to calculate hashes and fingerprints from that private data to determine whether our private server has been compromised and to detect your or our private data being distributed on the Internet.
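The hashing part of the data-leak example can be sketched as follows. This is a minimal illustration using SHA-256 from Python's standard library; the file names and contents are invented, and the helper names are hypothetical. A cryptographic hash only matches byte-for-byte copies, so detecting re-encoded or edited audio would need an audio fingerprint instead, which this sketch does not attempt.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of the SHA-256 hash of some bytes."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical contents of a private server (file name -> file bytes).
private_files = {
    "demo_a.wav": b"unreleased demo audio bytes",
    "demo_b.wav": b"another unreleased recording",
}

# Only the hashes need to be shared with whatever system scans the Internet.
known_private_hashes = {
    sha256_hex(content): name for name, content in private_files.items()
}


def find_leak(candidate: bytes):
    """Return the matching private file name, or None if there is no match."""
    return known_private_hashes.get(sha256_hex(candidate))


print(find_leak(b"unreleased demo audio bytes"))
print(find_leak(b"some unrelated public file"))
```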
We will not train models that predict labels containing protected attributes.
For example, we will not train a model that tries to predict whether the audio of a given song contains a male or female singer.
However, this principle does allow training models on the same audio, but with a different predicted label—such as genre or popularity instead of gender—unless forbidden by another principle in this document.
We will minimize our usage of individual human labels on creative datasets.
At Neocrym, our goal is to find underground songs that have the potential to become tomorrow's hit songs. This sets up an obvious ethical problem: Every individual has an opinion of what a hit song sounds like, but that opinion may be tainted by harmful biases.
We want to avoid training a model based on the opinions of a small number of employees or data-labelers. As such, we only teach a model about popularity using features extracted from the aggregate behavior of tens of millions of music fans.
Of course, the collective behavior of millions of people still encodes a bias—potentially even a harmful bias. But in the interest of helping artists make money, we need to model and analyze such collective biases.
The important part of this principle is to not let a few humans define broad opinion-based predictions of creativity, quality, or potential.
Other than that, there are numerous exceptions where we do need to attach human labels to our data samples:
- identifying fraud. Many artists increase their streaming site popularity metrics using fraudulent methods, such as using bots that pretend to be real listeners. When we train models on these popularity metrics, we risk training a model that is fooled by fraud. Because fraudsters continually innovate beyond the automated systems meant to detect fraud, our best hope for detecting fraud is by using human analysts to label tracks or artists as suspicious.
- identifying offensive content. Because the concept of what is offensive is so specific to human cultures, it would be impossible for us to identify offensive content with only machine-generated labels. There are many “dog whistle” phrases, like the 14 Words, that humans recognize as racist slogans but a computer may not.
- recommendations. Recommender systems are trained from the aggregate behavior of many users, but each user's recommendations are heavily influenced by their own behavior. This ethical principle should not prevent us from deploying recommender systems.
- personalization. We might create a model capable of generating or modifying music, but individual artists using the model may want to customize it as per their own preferences. Artists can customize a generative model by labeling a few data samples and using them to fine-tune the model's decoder.
- benchmarking. Lossy audio codecs like MP3 dramatically reduce the size of an audio file by discarding audio information that humans cannot hear. If we are developing software with psychoacoustic properties, we need to test it against human ears.
- CAPTCHAs. Many products like reCAPTCHA distinguish humans from bots by having them solve visual or audio puzzles. The resulting human-generated answers are used to train models solving problems like optical character recognition and speech-to-text. This principle does not prevent us from generating, using, or solving CAPTCHAs or similar products.
- near-deterministic or factual labels. There are many simple labels, such as whether a song contains any human voices at all, that our data sources do not provide. These labels are not opinions governing creativity or potential. Instead, they are deterministic facts that machines need human labels for.
Classification and regression models
A model should not be used to directly reject a candidate.
Many Ethical ML "cautionary tales" involve companies creating models to approve or deny applicants for jobs, loans, insurance policies, and so forth. These models learned from historical data that contained illegal biases and therefore made illegally biased decisions going forward.
We recognize the problematic history of using machine learning to judge people, and it makes us very careful about what we do. Some things we keep in mind when we do our work:
- A model never has the final say. We use machine learning models to search and sort through tens of millions of songs that have been ignored by the rest of the music industry. Our use of machine learning makes us very good at finding underestimated talent, but no judgement from a model guarantees or prevents us from working with an artist.
- We recognize that all models are biased, whether we know it or not. Philosophically speaking, there is no such thing as a human or machine learning model that is entirely free of any type of bias. Whenever we see the output from a model, we have to consider whether the output is the result of a harmful bias.
- We use models to counteract our human biases. The last hundred-odd years of music industry history is filled with music executives demonstrating unethical biases involving race, gender, socioeconomic status, and so forth. While we recognize that our models probably have biases too, we build and maintain models so they can challenge and counteract the human biases we hold.
Unfortunately, there are many cases where a model may directly decide if we do business with a person, and we cannot change our usage of such a model without creating a major risk for our business.
We may use such models for:
- defensive computer security tools. We may use many different models to protect our company, our employees, and our property. It is infeasible for a human to review decisions from such models. Some examples include models trained to filter spam or computer viruses.
- detecting copyright infringement. At Neocrym, we look for undiscovered songs from independent artists that have an auditory resemblance to popular hit songs. However, the easiest way to create a song resembling a hit is to plagiarize from a hit song. It could be incredibly harmful for our business if we end up financing, promoting, or releasing a work that infringes upon another work. As such, we may scan songs using copyright detection models without validating every single prediction with a human-led investigation.
We will not train models with the intent of identifying specific individuals.
- Neocrym will not train models that can identify a person from a given image, audio, or video as input.
- Neocrym will not train models that generate or collect faceprints or voiceprints for the purpose of identifying specific individuals. In this context, a faceprint or voiceprint is a piece of data used to identify a person, in contrast to an audio fingerprint that is used to identify a specific piece of audio.
- Neocrym will not train stylometry models for the purpose of recognizing the author of a passage of text.
As previously noted, this principle does not forbid training models that can recognize other songs for the purpose of copyright detection. A satisfactory copyright detection system must know the names of the rightsholders of an infringed work. Therefore, a copyright detection system would have some limited ability to connect an arbitrary song to an individual's identity. However, this is a necessary compromise to avoid infringing on copyrights.
We will not use generative models to depict an existing human or their likeness without their consent.
We will only use generative models to mimic somebody's appearance, voice, or writing style with their consent.
We will prevent our generative models from infringing upon their own training datasets.
Many machine learning models sometimes output a few of their training samples verbatim. Because of this, there is always a risk that a generative model trained on copyrighted data could end up committing copyright infringement.
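One possible safeguard, sketched below, is to compare each generated output against the training samples and flag outputs that share a long verbatim run with any of them. This document does not specify how such a check is implemented. The token representation, the `max_run` threshold, and the function names are assumptions made for illustration.

```python
def longest_shared_run(generated, training_sample):
    """Length of the longest contiguous token run found in both sequences."""
    best = 0
    previous = [0] * (len(training_sample) + 1)
    for generated_token in generated:
        current = [0] * (len(training_sample) + 1)
        for j, training_token in enumerate(training_sample, start=1):
            if generated_token == training_token:
                # Extend the run that ended at the previous token pair.
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def looks_memorized(generated, training_samples, max_run=8):
    """Flag outputs that copy max_run or more consecutive training tokens."""
    return any(
        longest_shared_run(generated, sample) >= max_run
        for sample in training_samples
    )


# Hypothetical token sequences (these could be words, notes, or audio codes).
training_samples = ["c d e f g a b c d e".split()]
copied_output = "x y c d e f g a b c z".split()
novel_output = "a c e g b d f".split()

print(looks_memorized(copied_output, training_samples))
print(looks_memorized(novel_output, training_samples))
```

A check like this only catches verbatim copying. An output can still resemble a training sample closely without sharing an exact run, so it reduces the risk rather than removing it.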
We will not use generative models to generate pornographic content.
This principle forbids creating and/or using generative models to generate any human depiction that is pornographic in nature.
There are numerous ethical and legal issues with using generative models specifically to create or modify pornography:
- Issues involving child pornography. Pornographic depictions of minors—even fictional depictions—are illegal in much of the industrialized world. There is no possible way to prevent a porn-generating model from generating images that are either considered by law enforcement to be fictional child pornography or mistaken as real child pornography.
- Plausible deniability for real crimes. The existence of a model capable of generating fictional child pornography (or fictional depictions of crimes like sexual assault) can be used to cast doubt on genuine evidence in real criminal cases. Eventually, generative models will become good enough to replicate the exact statistical characteristics of real images, at which point no human or computer will be able to visually inspect the difference between fake photographic evidence and real photographic evidence.
- Potential for blackmail. "Revenge porn" describes pornographic content that is distributed without the consent of everybody depicted. Adversaries often publish—or threaten to publish—revenge porn specifically to blackmail a person depicted in the porn. If a porn-generating model can produce sufficiently-realistic images of a victim, then the model could easily become a tool for humiliation or blackmail.
This AI Ethics Code does not apply to humans creating pornography without the use of generative machine learning models. Neocrym may have other restrictions on employees' production or consumption of pornography, but such restrictions are not in this document.
Additionally, this AI Ethics Code does not forbid the development and use of generative models that are not specifically related to pornography, even if they can be used on pornography. For example, photo editing software can use models to make human faces more attractive, such as by whitening teeth or removing blemishes. Such a model functions identically whether the photo depicts a clothed person or a nude person.