[2023 kcdc] the elephant in your data set – avoid bias in machine learning

Posted on June 22, 2023 by Jeanne Boyarsky

Speaker: Michelle Frost

For more, see the table of contents.

Notes

Intersectionality wheel of privilege. Many spokes, each ranging from power to erased to marginalized. Used the version posted here
Bias – inclination or prejudice for or against one person or group
ML Bias – systematic error in the model itself due to assumptions
Sometimes bias is necessary – inductive bias – assumptions combined with training examples to classify
Models with high bias oversimplify the problem (underfitting)
Each stage has potential harmful bias
Bias feeds back into model
In ML, when something looks too good to be true, it probably is

Points of bias

Historical – prejudice in world as it exists today. Gave example from ChatGPT where assumed a nurse was female even when replaced pronouns. Full example here
Representation bias – Sample under-represents part of population. Can’t make effective predictions for that group. Article describing. “Solved” by dropping gorillas as a label
Measurement bias – using a proxy to represent a construct. Problem if oversimplifying or accuracy varies across groups. COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) example. Data measures policing, not just the offender.
Aggregation bias – one size fits all model assumes mapping inputs to labels is consistent. For example, could mean something different across cultures. Such as LSD being Lake Shore Drive in Chicago and not a drug. Or racial differences for HbA1c
Learning bias – modeling choice may prioritize one objective which damages another. Such as Amazon’s recruiting tool discriminating against women
Evaluation bias – benchmark data does not represent the population. Might make sense in some scenarios. Project Gender Shades analyzed differences in different tools.
Deployment bias – model intended to solve one problem, but used in a different way. Make a hook for tuna and use it on a shark. Child abuse protection tool fails poor families.

Simpson’s paradox

Other attributes act as a proxy for the attribute left out
Association disappears, reappears or reverses when you divide the population
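To make the reversal concrete, here is a minimal sketch with made-up numbers (the options "P" and "Q", the "easy"/"hard" subgroups, and all counts are hypothetical, not from the talk). Option P has the higher success rate inside each subgroup, yet Q looks better once the subgroups are pooled, because P was mostly tried on the hard cases.

```python
# Hypothetical counts: (successes, trials) for each option and subgroup.
counts = {
    "P": {"easy": (9, 10), "hard": (30, 90)},
    "Q": {"easy": (80, 90), "hard": (3, 10)},
}


def subgroup_rates(option):
    """Success rate of one option within each subgroup."""
    return {name: s / t for name, (s, t) in counts[option].items()}


def overall_rate(option):
    """Success rate of one option with the subgroups pooled together."""
    successes = sum(s for s, _ in counts[option].values())
    trials = sum(t for _, t in counts[option].values())
    return successes / trials


for option in counts:
    print(option, subgroup_rates(option), round(overall_rate(option), 3))
```

Which view is the right one depends on how the subgroups came about, so the sketch only shows that the reversal can happen, not which number to trust.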

Terms

Protected class – category where bias is relevant
Sensitive characteristics – algorithmic decisions where bias could be factor
Disparate treatment
Disparate outcome/impact
Fairness – area of research to ensure biases and model inaccuracies do not lead to models that treat individuals unfavorably due to sensitive characteristics.

Metrics

Demographic parity – decisions/outcomes independent of protected attribute. Does not protect against all unfairness
Equalized odds – decision independent of protected attribute once you account for the actual outcome. True and false positive rates must be equal across groups
Equal opportunity – like equalized odds but only measures fairness for true positive rates
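A rough sketch of how the three metrics above could be computed by hand. The labels, predictions and group names are toy values invented for illustration, and the helper names `group_rates` and `gap` are hypothetical. It assumes binary labels and that every group contains both positive and negative examples.

```python
# Toy data (hypothetical): 8 rows for group "a", then 8 rows for group "b".
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0]
group = ["a"] * 8 + ["b"] * 8


def group_rates(value):
    """Selection rate, true positive rate and false positive rate for one group."""
    rows = [(t, p) for t, p, g in zip(y_true, y_pred, group) if g == value]
    tp = sum(1 for t, p in rows if t == 1 and p == 1)
    fp = sum(1 for t, p in rows if t == 0 and p == 1)
    positives = sum(1 for t, _ in rows if t == 1)
    negatives = len(rows) - positives
    return {
        "selection_rate": (tp + fp) / len(rows),  # demographic parity compares this
        "tpr": tp / positives,                    # equal opportunity compares this
        "fpr": fp / negatives,                    # equalized odds compares tpr and fpr
    }


def gap(metric):
    """Absolute difference between the two groups for one metric."""
    return abs(group_rates("a")[metric] - group_rates("b")[metric])


for metric in ("selection_rate", "tpr", "fpr"):
    print(metric, gap(metric))
```

In this toy data both groups are selected at the same rate, so demographic parity holds, while the true and false positive rates differ between groups. That is one way a model can pass demographic parity and still be unfair.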

Demo

A popular (bad) data set is “adult data set”. I think it is this one.
Not balanced by gender, race, country
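A quick way to see this kind of imbalance is to count how often each value of a column appears. The sketch below uses a handful of invented rows, not the real data set; the column name `sex` and the helper `group_shares` are assumptions for illustration.

```python
from collections import Counter

# Hypothetical rows standing in for a loaded data set.
rows = [{"sex": "Male"}] * 7 + [{"sex": "Female"}] * 3


def group_shares(rows, column):
    """Fraction of rows that fall under each distinct value of a column."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


print(group_shares(rows, "sex"))
```

Running the same count on other columns (race, country) gives a first hint of which groups a model trained on the data may serve poorly.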

Book recommendations

Weapons of Math Destruction
Biased
The Alignment Problem
Invisible Women
The Big Nine
Automating Inequality

My take

The types of bias and examples were interesting. Good end to the day. The demo graphs proved the point about biased data nicely.

[2023 kcdc] With Great Power Comes Great Responsibility: The Ethics of AI

Posted on June 22, 2023 by Jeanne Boyarsky

Speaker: Matthew Renze

Twitter: @matthewrenze

For more, see the table of contents.

History

Tech has a tendency to be abused

land – slaves
mechanized warfare – expand influence
cyberware – mass surveillance

Alice and Bob

Need to decide if want to get cat or dog for kids.
One researches cats and one dogs.
Get into info bubble thinking cat lovers hate dogs and vice versa and mad at each other
Then talk to real people, learn people like both and get a cat and a dog.
A generation later they lose their jobs due to robots/AI. Their kids see lots of jobs because tech savvy.
Kids convince parents to upskill and get new job
Another generation later grandkids want biological augmentation and to marry an AI.
Feel lost in world no longer recognize
Learn about technology and see it is an evolution. Learn from grandchildren.

Today

When search for something, get more of it.
Then info bubble/echo chambers
Goal is to maximize engagement. This results in more extreme content so people click
Lose privacy – ex: shopping data predict pregnancy
Can deanonymize data with date of birth, sex and zip code
Little privacy now and soon a lot less
Algorithmic bias – ex: racially biased criminal risk score, males preferred in resumes

AI

Uncanny valley – distrust things that are almost like us
Hallucination – making up believable, but false info
Misinformation at scale
Lack of AI literacy

What can we do

Delete cookies
Incognito mode
Throwaway emails
Stop using “click holes” to get pulled down rabbit holes
Opt out
Privacy regulations
Limit/stop using social media
Talk to other people

AI Developers

Eliminate bias in data – diverse datasets, exclude protected attributes, retrain algorithm over time
Be able to explain how AI made decision. Use decision tree vs neural network where can.
Let users choose how much error they allow
Don’t allow full autonomy

Fight misinformation

Who is the author/publisher?
What are their sources?
How strong is the evidence?

Near Future

Significant unemployment – simple/repetitive/costly jobs. Expect 20%+ jobs to go away by 2030 and be replaced by other higher tech jobs
Labor market unprepared for rapid change
Society is unprepared for change.
Many people left behind in poverty.
Synthetic media – indistinguishable from human data. Propaganda/misinformation at scale. Deep fakes. Deep nude (remove clothes without permission), etc
With 10 likes, AI knows you as well as a colleague.
Surveillance capitalism – can’t detect being manipulated
Greater social stratification – income gap
Safety issues – does self driving car protect driver or pedestrian
Autonomous weapons – currently a human is in the loop

Solutions

Educate everyone/AI literacy, Basics of ML, DL (deep learning), RL (reinforcement learning)
Job retraining
Retirement options for those too old to reskill
Mandatory higher ed – mandatory high school was controversial
Universal basic income/negative income tax
Deep fake detection – arms race
Digital alibi – so can prove what doing at all times and therefore not in fake video
Blockchain for everything so have complete audit trail
Default mode of skepticism

Further Future – Speculative

AGI (artificial general intelligence) – at least as smart as average person
Improve health
Solve biggest problem – climate change, politics, government
Humans could become obsolete – ex: horses became obsolete to farms. “Peak horse” was in 1915
Collapse of modern institutions – could break capitalism.
Changes already faster than society can adapt. What happens when new discoveries every day?
Dystopian future – authoritarianism, communism, fascism, AI religion, AI super bureaucracy
Or a better AI based government
ASI (artificial super intelligence) – if create AGI, intelligence explosion can happen fast. AGI can rewrite its own code.
Alignment problem – how do we align human and AI values. Reward hacking – find loopholes
AI run amok – what happens if robots mine asteroids. When does it stop
Conflicts – are we pets, ants, raw materials, competition, a threat?

Positives

We evolved for short bursts of stress.
Modern society is chronic stress
Be mindful with tech
Respect AI
Don’t fear/fight change
Use tech when beneficial and skip when not
Reward AI goal states
Keep ability to intervene if decision doesn’t align

Long run

Peacefully coexist with AI
AI wins
AI and humanity merge – most likely option
Humanity ends itself

Merge

No “us vs them” problem.
Phones an extension of us
Younger generation willing to merge with mind
VR/AR glasses
Gene editing
Brain/computer interfaces
Next version of people likely to be very different

My take

The Alice and Bob stories are fun. There was a ton of information. It went very fast and definitely need time to process. I expected more discussion of ethics rather than covering “everything” but I’m happy with how it turned out.

[devnexus 2022] meta-modern software architecture

Posted on April 13, 2022 by Jeanne Boyarsky

Speaker: Neal Ford from thoughtworks

@neal4d

Link to table of contents

———————

Where architectures come from

Architecture is reactive
Someone starts doing something, then others do
Once a bunch doing, named (after the fact)
Reflection on how doing software development at the time
Once in an architecture, can watch how it grows and changes

Eras

Victorian – 1801-1900 – science, classifying natural world
Modernism – 1890-1945 – industrial revolution, explosive growth of cities, abstract art, radio.
Post-modernism – 1946-1990 – pushback against modernism, irony, questioning everything, television, Seinfeld’s ”never hug, never learn”
Post-post modernism or metamodernism – 1990-present – internet
Naming things is hard. Not just in software. Modernism is bad choice of name because what would come next

Metamodern

In 1989, to find out Chicago weather would need to watch Weather Channel and wait for it to cycle around or go to library. Now pull it up online.
In 1989, could read a few books and know pretty much everything about wine. Now too much info and keep generating more.
Holism – view various systems as whole
Parks and Recreation is first meta-modern show
Breaking Bad – color driven – yellow is safe and purple is bad
Return to sentimentality. Can’t live on irony alone

Software architecture

microservices – one of most popular pages on Martin Fowler’s website. Say what it is and more importantly, what it isn’t.
First law of software architecture: ”Everything in software architecture is a tradeoff”. If you haven’t encountered a tradeoff yet, you will in the future
Reuse reduces complexity but comes with high coupling
Metamodern software architect needs to do tradeoff analysis. Ex: things that change slowly are good for reuse such as frameworks and OS
service mesh and sidecar pattern – orthogonal coupling

Books

”Fundamentals of Software Architecture”
”Software Architecture: the Hard Parts”
”Data Mesh”

Forces

Consistency – atomic, eventual
Communication – sync, async
Coordination – orchestrated, choreographed
8 possibilities by choosing one of each. ex: one is a monolith. All 8 can exist as pattern or antipattern.
Named them transactional sagas: epic, fantasy fiction, fairy tale, parallel, phone tag, horror story, time travel, anthology
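The count of 8 comes from picking one of two options for each of the three forces (2 × 2 × 2). A small sketch that enumerates the combinations is below; the variable names are mine, and it deliberately does not assign the saga names to combinations since the notes don't record that mapping.

```python
from itertools import product

# The three forces and the two options for each, as listed in the notes.
forces = {
    "consistency": ("atomic", "eventual"),
    "communication": ("sync", "async"),
    "coordination": ("orchestrated", "choreographed"),
}

# One dict per combination, e.g. {"consistency": "atomic", "communication": "sync", ...}
combinations = [dict(zip(forces, choice)) for choice in product(*forces.values())]

for combo in combinations:
    print(combo)

print(len(combinations), "combinations")
```

Each printed line is one point in the tradeoff space an architect would have to analyze.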

Richard Feynman

Computers used to be a room full of people (usually women) calculating things
Feynman added specialization and parallelization. Some people are better at some tasks than others. And recovering from problems
1945 – atomic bomb blast is what shifted eras
Reconsider why continuing to do a thing. Revisit when reasons change

Internet

Pushed us to net era
Volkswagen used software to cheat on emissions test. Some people knew they were actively working to break the law
Facebook keeps getting busted for doing bad things – data breaches, illegally tracking users, Cambridge Analytica, using two factor for marketing.
Last week, Facebook made up a meme about TikTok and students slapping teachers. Then it became a self-fulfilling prophecy

Finance and ethics

Modernism – double entry accounting
Post-modern – quants
Metamodern – humane corporation, ethics. Recognize all connected to each other
Don’t want to create something cool and spend the rest of your career on an apology tour
Apple, Google employees pushed back

My take

Fun start to the day. I hadn’t heard of the ”saga” approach before. Googling, at least some of them seem to be a real thing. (And all are from ”the hard parts” book.) I also increased my book reading list. The end felt rushed. Maybe because started late?


Down Home Country Coding With Scott Selikoff and Jeanne Boyarsky

Java/J2EE Software Development and Technology Discussion Blog
