Archive for the ‘Machine Learning’ Category

What a machine learning tool that turns Obama white can (and can't) tell us about AI bias – The Verge

It's a startling image that illustrates the deep-rooted biases of AI research. Input a low-resolution picture of Barack Obama, the first black president of the United States, into an algorithm designed to generate depixelated faces, and the output is a white man.

It's not just Obama, either. Get the same algorithm to generate high-resolution images of actress Lucy Liu or congresswoman Alexandria Ocasio-Cortez from low-resolution inputs, and the resulting faces look distinctly white. As one popular tweet quoting the Obama example put it: "This image speaks volumes about the dangers of bias in AI."

But what's causing these outputs, and what do they really tell us about AI bias?

First, we need to know a little bit about the technology being used here. The program generating these images is an algorithm called PULSE, which uses a technique known as upscaling to process visual data. Upscaling is like the "zoom and enhance" tropes you see in TV and film but, unlike in Hollywood, real software can't just generate new data from nothing. In order to turn a low-resolution image into a high-resolution one, the software has to fill in the blanks using machine learning.

In the case of PULSE, the algorithm doing this work is StyleGAN, which was created by researchers from NVIDIA. Although you might not have heard of StyleGAN before, you're probably familiar with its work. It's the algorithm responsible for making those eerily realistic human faces that you can see on websites like ThisPersonDoesNotExist.com; faces so realistic they're often used to generate fake social media profiles.

What PULSE does is use StyleGAN to imagine the high-res version of pixelated inputs. It does this not by enhancing the original low-res image, but by generating a completely new high-res face that, when pixelated, looks the same as the one inputted by the user.

This means each pixelated image can be upscaled in a variety of ways, the same way a single set of ingredients makes different dishes. It's also why you can use PULSE to see what Doom guy, or the hero of Wolfenstein 3D, or even the crying emoji look like at high resolution. It's not that the algorithm is finding new detail in the image as in the "zoom and enhance" trope; it's instead inventing new faces that revert to the input data.
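The search idea described above can be sketched in a toy form: sample many candidate outputs from a generator and keep the one whose downscaled version best matches the low-res input. This is a loose, illustrative analogue only; the real PULSE performs gradient-based search through StyleGAN's latent space, and the 1-D "images" and function names here are invented for the sketch.

```python
import numpy as np

def downscale(x, factor=4):
    # Average-pool a 1-D "image" down by the given factor.
    return x.reshape(-1, factor).mean(axis=1)

def pulse_style_search(low_res, generate, n_candidates=500, seed=0):
    """Toy analogue of PULSE's search: rather than enhancing the input,
    sample outputs from a generator and keep the one whose *downscaled*
    version best matches the low-res target."""
    rng = np.random.default_rng(seed)
    best, best_err = None, np.inf
    for _ in range(n_candidates):
        candidate = generate(rng)          # stand-in for a StyleGAN sample
        err = np.sum((downscale(candidate) - low_res) ** 2)
        if err < best_err:
            best, best_err = candidate, err
    return best

# Toy generator: random 16-sample signals (stand-in for generated faces).
gen = lambda rng: rng.random(16)
target = np.array([0.2, 0.5, 0.8, 0.3])    # a 4-pixel "low-res image"
hi = pulse_style_search(target, gen)
# hi is a 16-sample candidate whose 4-sample downscale approximates target.
```

Many different high-res candidates downscale to nearly the same low-res pattern, which is exactly why the output is an invention rather than a recovery of the original face.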

This sort of work has been theoretically possible for a few years now, but, as is often the case in the AI world, it reached a larger audience when an easy-to-run version of the code was shared online this weekend. That's when the racial disparities started to leap out.

PULSE's creators say the trend is clear: when used to scale up pixelated images, the algorithm more often generates faces with Caucasian features.

"It does appear that PULSE is producing white faces much more frequently than faces of people of color," wrote the algorithm's creators on GitHub. "This bias is likely inherited from the dataset StyleGAN was trained on [...] though there could be other factors that we are unaware of."

In other words, because of the data StyleGAN was trained on, when it's trying to come up with a face that looks like the pixelated input image, it defaults to white features.

This problem is extremely common in machine learning, and it's one of the reasons facial recognition algorithms perform worse on non-white and female faces. Data used to train AI is often skewed toward a single demographic, white men, and when a program sees data not in that demographic, it performs poorly. Not coincidentally, it's white men who dominate AI research.

But exactly what the Obama example reveals about bias, and how the problems it represents might be fixed, are complicated questions. Indeed, they're so complicated that this single image has sparked heated disagreement among AI academics, engineers, and researchers.

On a technical level, some experts aren't sure this is even an example of dataset bias. The AI artist Mario Klingemann suggests that the PULSE selection algorithm itself, rather than the data, is to blame. Klingemann notes that he was able to use StyleGAN to generate more non-white outputs from the same pixelated Obama image.

"These faces were generated using the same concept and the same StyleGAN model but different search methods to Pulse," says Klingemann, who adds that we can't really judge an algorithm from just a few samples. "There are probably millions of possible faces that will all reduce to the same pixel pattern and all of them are equally correct," he told The Verge.

(Incidentally, this is also the reason why tools like this are unlikely to be of use for surveillance purposes. The faces created by these processes are imaginary and, as Klingemann's examples show, have little relation to the ground truth of the input. However, it's not like huge technical flaws have stopped police from adopting technology in the past.)

But regardless of the cause, the outputs of the algorithm seem biased, something that the researchers didn't notice before the tool became widely accessible. This speaks to a different and more pervasive sort of bias: one that operates on a social level.

Deborah Raji, a researcher in AI accountability, tells The Verge that this sort of bias is all too typical in the AI world. "Given the basic existence of people of color, the negligence of not testing for this situation is astounding, and likely reflects the lack of diversity we continue to see with respect to who gets to build such systems," says Raji. "People of color are not outliers. We're not 'edge cases' authors can just forget."

The fact that some researchers seem keen to only address the data side of the bias problem is what sparked larger arguments about the Obama image. Facebook's chief AI scientist Yann LeCun became a flashpoint for these conversations after tweeting a response to the image saying that "ML systems are biased when data is biased," and adding that this sort of bias is a far more serious problem in a deployed product than in an academic paper. The implication being: let's not worry too much about this particular example.

Many researchers, Raji among them, took issue with LeCun's framing, pointing out that bias in AI is affected by wider social injustices and prejudices, and that simply using correct data does not deal with the larger injustices.

Others noted that even from the point of view of a purely technical fix, "fair" datasets can often be anything but. For example, a dataset of faces that accurately reflected the demographics of the UK would be predominantly white because the UK is predominantly white. An algorithm trained on this data would perform better on white faces than non-white faces. In other words, "fair" datasets can still create biased systems. (In a later thread on Twitter, LeCun acknowledged there were multiple causes for AI bias.)
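The UK-demographics point can be made concrete with a tiny simulation. Under the illustrative assumption that a model's per-group error shrinks with the number of training examples it has for that group, a dataset that "fairly" mirrors a 90/10 population split still learns the minority group far less precisely. The numbers and setup here are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(42)

# A "representative" dataset mirroring a 90/10 population split: the model
# sees 900 examples from group A but only 100 from group B.
n_a, n_b = 900, 100

def mean_sq_error(n, trials=2000):
    """Error of a toy model that just estimates a group's feature mean
    from n samples, averaged over many re-draws of the training set."""
    draws = rng.normal(0.0, 1.0, (trials, n))
    return np.mean(draws.mean(axis=1) ** 2)

err_a = mean_sq_error(n_a)   # roughly 1/900
err_b = mean_sq_error(n_b)   # roughly 1/100
# The minority group's model error is about nine times larger, even though
# the dataset is a faithful mirror of the population.
```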

Raji tells The Verge she was also surprised by LeCun's suggestion that researchers should worry about bias less than engineers producing commercial systems, and that this reflected a lack of awareness at the very highest levels of the industry.

"Yann LeCun leads an industry lab known for working on many applied research problems that they regularly seek to productize," says Raji. "I literally cannot understand how someone in that position doesn't acknowledge the role that research has in setting up norms for engineering deployments."

When contacted by The Verge about these comments, LeCun noted that he'd helped set up a number of groups, inside and outside of Facebook, that focus on AI fairness and safety, including the Partnership on AI. "I absolutely never, ever said or even hinted at the fact that research does not play a role in setting up norms," he told The Verge.

Many commercial AI systems, though, are built directly from research data and algorithms without any adjustment for racial or gender disparities. Failing to address the problem of bias at the research stage just perpetuates existing problems.

In this sense, then, the value of the Obama image isn't that it exposes a single flaw in a single algorithm; it's that it communicates, at an intuitive level, the pervasive nature of AI bias. What it hides, however, is that the problem of bias goes far deeper than any dataset or algorithm. It's a structural issue that requires much more than technical fixes.

As one researcher, Vidushi Marda, responded on Twitter to the white faces produced by the algorithm: "In case it needed to be said explicitly - This isn't a call for diversity in datasets or improved accuracy in performance - it's a call for a fundamental reconsideration of the institutions and individuals that design, develop, deploy this tech in the first place."

Update, Wednesday, June 24: This piece has been updated to include additional comment from Yann LeCun.


SLAM + Machine Learning Ushers in the "Age of Perception" – Robotics Business Review

The recent crisis has increased focus on autonomous robots being used for practical benefit. We've seen robots cleaning hospitals, delivering food and medicines, and even assessing patients. These are all amazing use cases, and they clearly illustrate the ways in which robots will play a greater role in our lives from now on.

However, for all their benefits, a robot's ability to autonomously map its surroundings and successfully locate itself within them is still quite limited. Robots are getting better at doing specific things in planned, consistent environments; but dynamic, untrained situations remain a challenge.

Age of Perception

What excites me is the next generation of SLAM (Simultaneous Localization and Mapping) that will allow robot designers to create robots much more capable of autonomous operation in a broad range of scenarios. It is already under development and attracting investment and interest across the industry.

We are calling it the Age of Perception, and it combines recent advances in machine and deep learning to enhance SLAM. Increasing the richness of maps with semantic scene understanding improves localization, mapping quality and robustness.

Simplifying Maps

Currently, most SLAM solutions take raw data from sensors and use probabilistic algorithms to calculate the robot's location and a map of its surroundings. LIDAR is most commonly used, but increasingly, lower-cost cameras are providing rich data streams for enhanced maps. Whatever sensors are used, the data creates maps made up of millions of 3-dimensional reference points. These allow the robot to calculate its location.

The problem is that these clouds of 3D points have no meaning: they are just a spatial reference for the robot to calculate its position. Constantly processing all of these millions of points is also a heavy load on the robot's processors and memory. By inserting machine learning into the processing pipeline, we can both improve the utility of these maps and simplify them.

Panoptic Segmentation

Panoptic segmentation techniques use machine learning to categorize collections of pixels from camera feeds into recognizable objects. For example, the millions of pixels representing a wall can be categorized as a single object. In addition, we can use machine learning to predict the geometry and the shape of these pixels in the 3D world. So, millions of 3D points representing a wall can all be summarized into a single plane. Millions of 3D points representing a chair can all be summarized into a shape model with a small number of parameters. Breaking scenes down into distinct objects in 2D and 3D lowers the overhead on processors and memory.
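As an illustration of how a segmented object can be compressed, the points labeled as a wall can be fitted with a plane: the plane's normal is the direction of least variance in the cloud, recoverable from an SVD. This is a generic textbook sketch, not SLAMcore's implementation; the synthetic "wall" data is invented for the example.

```python
import numpy as np

def summarize_as_plane(points):
    """Collapse a cloud of 3D points lying near a plane into just four
    numbers: a unit normal n and an offset d, with the plane n . x = d."""
    centroid = points.mean(axis=0)
    # The plane normal is the direction of least variance in the cloud,
    # i.e. the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return normal, normal @ centroid

# Synthetic "wall": points near the plane z = 0 with a little sensor noise.
rng = np.random.default_rng(1)
n_pts = 10_000
wall = np.column_stack([
    rng.uniform(0, 5, n_pts),      # x extent of the wall
    rng.uniform(0, 3, n_pts),      # y extent of the wall
    rng.normal(0, 0.01, n_pts),    # z is approximately 0 (noise only)
])
normal, offset = summarize_as_plane(wall)
# Ten thousand points reduced to four parameters; the recovered normal
# points along the z axis, matching the plane the points were drawn from.
```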


Adding Understanding

As well as simplifying maps, this approach provides the foundation for a greater understanding of the scenes the robot's sensors capture. With machine learning, we are able to categorize individual objects within the scene and then write code that determines how they should be handled.

The first goal of this emerging capability is to be able to remove moving objects, including people, from maps. In order to navigate effectively, robots need to reference static elements of a scene: things that will not move, and so can be used as reliable locating points. Machine learning can be used to teach autonomous robots which elements of a scene to use for localization and which to disregard as parts of the map, classifying them instead as obstacles to avoid. Combining the panoptic segmentation of objects in a scene with underlying map and location data will soon deliver massive increases in the accuracy and capability of robotic SLAM.
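A minimal sketch of the filtering step described here, with made-up class names standing in for the outputs of a real panoptic-segmentation model: points on static structure are kept as localization landmarks, while points on moving objects are routed to obstacle avoidance instead.

```python
import numpy as np

# Hypothetical semantic classes from a panoptic-segmentation model.
STATIC_CLASSES = {"wall", "floor", "ceiling"}

def split_map_points(points, labels):
    """Keep points on static structure for localization; everything else
    is treated as an obstacle rather than part of the persistent map."""
    is_static = np.array([lbl in STATIC_CLASSES for lbl in labels])
    return points[is_static], points[~is_static]

points = np.array([[0.0, 0.0, 2.0],   # a point on a wall
                   [1.0, 1.0, 0.0],   # a point on the floor
                   [0.5, 2.0, 1.0]])  # a person walking through the scene
labels = ["wall", "floor", "person"]
landmarks, obstacles = split_map_points(points, labels)
# landmarks holds the wall and floor points; obstacles holds the person.
```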

Perceiving Objects

The next exciting step will be to build on this categorization to add a level of understanding of individual objects. Machine learning, working as part of the SLAM system, will allow a robot to learn to distinguish the walls and floors of a room from the furniture and other objects within it. Storing these elements as individual objects means that adding or removing a chair will not necessitate a complete redrawing of the map.

This combination of benefits is the key to massive advances in the capability of autonomous robots. Robots do not generalize well in untrained situations; changes, particularly rapid movement, disrupt maps and add significant computational load. Machine learning creates a layer of abstraction that improves the stability of maps, and the greater efficiency it brings to data processing frees up the headroom to add more sensors and more data, increasing the granularity and information that can be included in maps.


Natural Interaction

Linking location, mapping, and perception will allow robots to understand more about their surroundings and operate in more useful ways. For example, a robot that can perceive the difference between a hall and a kitchen can undertake more complex sets of instructions. Being able to identify and categorize objects such as chairs, desks, and cabinets will improve this still further. Instructing a robot to go to a specific room to get a specific thing will become much simpler.

The real revolution in robotics will come when robots start interacting with people in more natural ways: robots that learn from multiple situations and combine that knowledge into a model that allows them to take on new, untrained tasks based on maps and objects preserved in memory. Creating those models and abstractions demands complete integration of all three layers of SLAM. Thanks to the efforts of those who are leading the industry in these areas, I believe that the Age of Perception is just around the corner.

Editor's Note: Robotics Business Review would like to thank SLAMcore for permission to reprint the original article.


Google's new ML Kit SDK keeps all machine learning on the device – SlashGear

Smartphones today have become so powerful that sometimes even mid-range handsets can support some fancy machine learning and AI applications. Most of those, however, still rely on cloud-hosted neural networks, machine learning models, and processing, which has both privacy and efficiency drawbacks. Contrary to what most would expect, Google has been moving to offload much of that machine learning activity from the cloud to the device, and its newest development tool is the latest step in that direction.

Google's machine learning kit, or ML Kit SDK, has been around for two years now, but it has largely been tied to its Firebase mobile and web development platform. Like many Google products, this creates a dependency on a cloud platform that entails not just some latency due to network bandwidth but also the risk of leaking potentially private data in transit.

While Google is still leaving that ML Kit + Firebase combo available, it is now also launching a standalone software development kit, or SDK, for both Android and iOS app developers that focuses on on-device machine learning. Since everything happens locally, the user's privacy is protected and the app can function almost in real time regardless of the speed of the Internet connection. In fact, an app using the new ML Kit can even work offline.

The implications of this new SDK could be quite significant, but it still depends on developers switching from the Firebase version to the standalone SDK. To give them a hand, Google created a code lab that combines the new ML Kit with its CameraX app in order to translate text in real time without connecting to the Internet.

This can definitely help boost confidence in AI-based apps if the user no longer has to worry about privacy or network problems. Of course, Google would probably prefer that developers keep using the Firebase connection, which it even describes as getting "the best of both products."


AI and Machine Learning Are Changing Everything. Here’s How You Can Get In On The Fun – ExtremeTech

This site may earn affiliate commissions from the links on this page. Terms of use.

There isn't just one new story every week about an interesting new application of artificial intelligence and machine learning happening out there somewhere. There are actually at least five of those stories. Maybe 10. Sometimes, even more.

Like how UK officials are using AI to spot invasive plant species and stop them before they cause expensive damage to roads. Or how artificial intelligence is playing a key role in the fight against COVID-19. Or even, in the ultimate mind-bending Black Mirror-type idea, how AI is actually being used to help build and manage other AIs.

Scariness aside, the power of artificial intelligence and machine learning to revolutionize the planet is taking hold in virtually every industry imaginable. With implications like that, it isn't hard to understand how a computer science type trained in AI practices can become a key member of any business, with a paycheck to match.

The skills to get into this exploding field can be had in training like The Ultimate Artificial Intelligence Scientist Certification Bundle ($34.99, over 90 percent off).

The collection features four courses and almost 80 hours of content, introducing interested students to the skills, tools, and processes needed to not only understand AI but apply that knowledge to any given field. With nearly 200,000 positive reviews from more than a million students who have taken the courses, it's clear why these Super Data Science-taught training sessions attract so many followers.

The coursework begins at the heart of AI and machine learning with the Python A-Z course.

Python is the language most prominently linked to the development of such techniques, and students follow step-by-step tutorials to understand how Python coding works, then apply that training to actual real-world exercises. Even learners who had never delved into AI's inner workings said the course made them eager to learn more about data science.

With the basic underpinnings in hand, students move to Machine Learning A-Z, where more advanced theories and algorithms take on practical shape with a true user's guide to crafting your own thinking computers. Students get a true feel for machine learning from professional data scientists, who help even complex ideas like dimensionality reduction become relatable.

In Deep Learning A-Z, large data sets work hand in hand with programming fundamentals to help students unlock AI principles in some exciting projects. Students work with artificial neural networks and put them into practice to see how machines can actually think for themselves.

Finally, Tensorflow 2.0: A Complete Guide on the Brand New Tensorflow takes a closer look at TensorFlow, one of the most powerful tools AI experts use to craft working networks. Actual TensorFlow exercises will explain how to build models and construct large-scale neural networks so machines can understand all the information they're processing, then use that data to define their own solutions to problems.

Regularly priced at $200 per course, you can pick up all four courses now for just $34.99.

Note: Terms and conditions apply. See the relevant retail sites for more information. For more great deals, go to our partners at TechBargains.com.



Discovery of aggressive cancer cell types by Vanderbilt researchers made possible with machine learning techniques – Vanderbilt University News

By applying unsupervised and automated machine learning techniques to the analysis of millions of cancer cells, Rebecca Ihrie and Jonathan Irish, both associate professors of cell and developmental biology, have identified new cancer cell types in brain tumors. Machine learning is a series of computer algorithms that can identify patterns within enormous quantities of data and get smarter with more experience. This finding holds the promise of enabling researchers to better understand and target these cell types for research and therapeutics for glioblastoma, an aggressive brain tumor with high mortality, and demonstrates the broader applicability of machine learning to cancer research.

With their collaborators, Ihrie and Irish developed Risk Assessment Population IDentification (RAPID), an open-source machine learning algorithm that revealed coordinated patterns of protein expression and modification associated with survival outcomes.

The article, "Unsupervised machine learning reveals risk stratifying glioblastoma tumor cells," was published online in the journal eLife on June 23. RAPID code and examples are available on the cytolab GitHub page.

For the past decade, the research community has been working to leverage machine learning's ability to absorb and analyze more data for cancer cell research than the human mind alone can process. "Without any human oversight, RAPID combed through 2 million tumor cells, with at least 4,710 glioblastoma cells from each patient, from 28 glioblastomas, flagging the most unusual cells and patterns for us to look into," said Ihrie. "We're able to find the needles in the haystack without searching the entire haystack. This technology lets us devote our attention to better understanding the most dangerous cancer cells and to get closer to ultimately curing brain cancer."

Fed into RAPID were data on cellular proteins that govern the identity and function of neural stem cells and other brain cells. The data type used is called single-cell mass cytometry, a measurement technique typically applied to blood cancer. Once RAPID's statistical analysis was complete and the needles in the haystack were found, only those cells were studied. "One of the most exciting results of our research is that unsupervised machine learning found the worst offender cells without needing the researchers to give it clinical or biological knowledge as context," said Irish, also scientific director of Vanderbilt's Cancer & Immunology Core. "The findings of this study currently represent the biggest biology advance from my lab at Vanderbilt."
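RAPID itself is open source and operates on mass cytometry data; as a generic illustration of the "needles in a haystack" idea, the sketch below marks cells whose marker profile sits far outside the bulk population, with no clinical labels supplied. The marker counts, thresholds, and function names are invented for the example and are not RAPID's actual method.

```python
import numpy as np

def flag_unusual_cells(expr, threshold=3.0):
    """Flag cells whose protein-expression profile sits far from the
    population bulk, using a robust z-score (median/MAD) per marker."""
    med = np.median(expr, axis=0)
    # 1.4826 scales the MAD to approximate a standard deviation for
    # normally distributed data.
    mad = np.median(np.abs(expr - med), axis=0) * 1.4826
    z = (expr - med) / mad
    # A cell counts as "unusual" if any marker deviates strongly.
    return np.max(np.abs(z), axis=1) > threshold

rng = np.random.default_rng(7)
bulk = rng.normal(0, 1, (990, 5))    # typical cells, 5 protein markers
rare = rng.normal(6, 0.5, (10, 5))   # a small aberrant population
cells = np.vstack([bulk, rare])
flags = flag_unusual_cells(cells)
# The 10 aberrant cells are flagged for follow-up, while almost all of
# the bulk population is left alone.
```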

The researchers' machine learning analysis enabled their team to study multiple characteristics of the proteins in brain tumor cells in relation to other characteristics, delivering new and unexpected patterns. "The collaboration between our two labs, the support that we received for this high-risk work from Vanderbilt and the Vanderbilt-Ingram Cancer Center (VICC), and the fruitful collaboration with neurosurgeons and pathologists who provided a unique opportunity to study human cells right out of the brain allowed us to achieve this milestone," said Ihrie and Irish in a joint statement. The co-first authors of the paper are former Vanderbilt graduate students Nalin Leelatian, a current neuropathology resident at Yale (Irish lab), and Justine Sinnaeve (Ihrie lab). Through her research and work on this topic, Leelatian earned the American Brain Tumor Association (ABTA) Scholar-in-Training Award at the American Association for Cancer Research (AACR) annual meeting in April 2017.

The applicability of this research extends beyond cancer research to data analysis techniques for broader human disease research and laboratory modeling of diseases using multiple samples. The paper also demonstrates that these complex patterns, once found, can be used to develop simpler classifications that can be applied to hundreds of samples. Researchers studying glioblastoma brain tumors will be able to refer to these findings as they test to see if their own samples are comparable to the cell and protein expression patterns discovered by Ihrie, Irish, and collaborators.

This work was supported by the Michael David Greene Brain Cancer Fund, a discovery grant for brain tumor research established in 2004. The grant was recently renewed for another five years to support Ihrie and Irish's continued research on glioblastoma. Additional support was provided by the National Institutes of Health, including the National Cancer Institute and National Institute of Neurological Disorders and Stroke, VICC and VICC Ambassadors, the Vanderbilt International Scholars program, a Vanderbilt University Discovery Grant, an Alpha Omega Alpha Postgraduate Award, a Society of Neurological Surgeons/RUNN Award, a Burroughs Wellcome Fund Physician-Scientist Institutional Award, the Vanderbilt Institute for Clinical and Translational Research, and the Southeastern Brain Tumor Foundation.
