Comment by Terr_

3 months ago

I feel like we need to disentangle a bunch of layers here. To take a stab at a quick categorization:

1. Policy bias, like if someone put in a system prompt to try to trigger certain outcomes.

2. Algorithmic/engineering bias, like if a vision algorithm has a harder time detecting certain details for certain skin tones under certain lighting.

3. Bias in the data set that is attributable to biased choices made by the company doing the curation.

4. Bias in the data set that is (unlike #3) mostly attributable to biases in the external field or in reality itself.

I fear that an awful lot of it is #4, where these models are surfacing distasteful statistical trends that already exist and would be concerning even if the technology didn't exist.