Archive for the ‘Machine Learning’ Category

Orange isn't building its own AI foundation model – here's why – Light Reading

There has been a flurry of interest in generative AI (GenAI) from telcos, each of which has taken its own nuanced approach to the idea of building its own large language models (LLMs). While Vodafone seems to dismiss the idea and Verizon appears content to build on existing foundation models, Deutsche Telekom and SK Telecom announced last year that they will develop telco-specific LLMs. Orange, meanwhile, doesn't currently see the need to build a foundation model, its chief AI officer Steve Jarrett recently told Light Reading.

Jarrett said the company is currently content with using existing models and adapting them to its needs using two main approaches. The first one is retrieval-augmented generation (RAG), where a detailed source of information is passed to the model together with the prompt to augment its response.

He said this allows the company to experiment with different prompts easily, adding that existing methodologies can be used to assess the results. "That is a very, very easy way to dynamically test different models, different styles of structuring the RAG and the prompts. And [...] that solves the majority of our needs today," he elaborated.
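To make the mechanics concrete, here is a minimal sketch of the RAG pattern Jarrett describes: retrieve the most relevant documents and pass them to the model alongside the question. The toy retriever, documents, and the hypothetical `call_llm` step are illustrative assumptions, not Orange's actual stack, which the article does not detail.

```python
# Minimal RAG sketch: look up supporting documents, then build a prompt that
# passes them to the model together with the user's question.
from difflib import SequenceMatcher

documents = [
    "Fiber offers are available in Lyon and Marseille.",
    "5G coverage in Paris reached 95% in 2023.",
]

def retrieve(question, docs, k=1):
    # Toy lexical retriever; production systems typically use vector search.
    scored = sorted(docs, key=lambda d: SequenceMatcher(None, question, d).ratio(),
                    reverse=True)
    return scored[:k]

def build_prompt(question):
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("What is the 5G coverage in Paris?"))
# The assembled prompt would then be sent to whichever LLM is being tested,
# e.g. call_llm(prompt) — a hypothetical placeholder here.
```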

At the same time, Jarrett admitted that the downside of RAG is that it may require a lot of data to be passed along with the prompt, making more complex tasks slow and expensive. In such cases, he argued, fine-tuning is a more appropriate approach.

Distilling models

In this case, he explained, "you take the information that you would have used in the RAG for [...] a huge problem area. And you make a new version of the underlying model that embeds all that information." Another related option is to distill the model.

This involves not just structuring the output of the model, but downsizing it, "like you're distilling fruit into alcohol," Jarrett said, adding "there are techniques to actually chop the model down into a much smaller model that runs much faster."
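For readers who want to see the basic idea in code, below is a minimal knowledge-distillation sketch in PyTorch: a small "student" network is trained to match the softened outputs of a larger "teacher". The architectures, temperature, and data are invented for illustration; the article does not describe Orange's distillation tooling.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical "teacher" (large) and "student" (small) networks; LLM
# distillation works the same way at a much larger scale.
teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10))
student = nn.Sequential(nn.Linear(128, 32), nn.ReLU(), nn.Linear(32, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # temperature: softens the teacher's output distribution

x = torch.randn(64, 128)          # one dummy batch of inputs
with torch.no_grad():
    teacher_logits = teacher(x)   # the big model's predictions (no gradients)

optimizer.zero_grad()
student_logits = student(x)
# Classic distillation loss: KL divergence between softened distributions
loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=-1),
    F.softmax(teacher_logits / T, dim=-1),
    reduction="batchmean",
) * (T * T)
loss.backward()
optimizer.step()
print(f"distillation loss: {loss.item():.4f}")
```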

Fine-tuning and distillation are, however, highly challenging. "Even my most expert people frequently make mistakes," he admitted, saying: "It's not simple, and the state of the art of the tools to fine tune are changing every single day." At the same time, he noted that these tools are improving constantly and, as a result, he expects fine-tuning to get easier over time.

He pointed out that building a foundation model from scratch would be an even more complex task, which the company currently doesn't see a reason to embark on. Nevertheless, he stressed that it's impossible to predict how things will evolve in the future.

Complexity budget

One possibility is that big foundational models will eventually absorb so much information that the need for RAG and other tools will diminish. In this scenario, Orange may never have to create its own foundation model, Jarrett said, "as long as we have the ability to distill and fine tune models, where we need to, to make the model small enough to run faster and cheaper and so on."

He added: "I think it's a very open question in the industry. In the end, will we have a handful of massive models, and everyone's doing 99% RAG and prompt engineering, or are there going to be millions of distilled and fine-tuned models?"

One factor that may determine where things will go in the future is what Jarrett calls the complexity budget. The concept conveys how much computing is needed, from start to finish, to produce an answer.

While a very large model may be more intensive to train in the beginning, there may be less computing required for RAG and fine-tuning. "The other approach is you have a large language model that also obviously took a lot of training, but then you do a ton more compute to fine tune and distill the model so that your model is much smaller," he added.
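As a back-of-the-envelope illustration of the complexity-budget idea (our framing, with invented numbers, not Orange's figures), total compute can be treated as training plus adaptation plus per-query inference:

```python
# Toy complexity-budget comparison; all FLOP counts below are invented
# purely to illustrate the trade-off Jarrett describes.
def complexity_budget(train, adapt, per_query, n_queries):
    return train + adapt + per_query * n_queries

# Option A: big model + RAG (cheap adaptation, expensive per query)
big = complexity_budget(train=1e24, adapt=1e20, per_query=1e15, n_queries=1e9)
# Option B: distill to a small model (costly adaptation, cheap per query)
small = complexity_budget(train=1e24, adapt=1e22, per_query=1e13, n_queries=1e9)

print(f"big+RAG: {big:.3e} FLOPs, distilled: {small:.3e} FLOPs")
# With enough queries, the distilled model's lower inference cost dominates.
```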

Apart from cost, there is also an environmental concern. While hyperscalers tend to perform relatively well in terms of using clean energy, and Jarrett claimed that Orange is "fairly green as a company," he added that the carbon intensity of the energy used for on-premises GPU clusters tends to vary in the industry.

Right tool for the job

The uncertainty surrounding GenAI's future evolution is one of the reasons why Orange is taking a measured approach to the technology, with Jarrett stressing it is not a tool that's suited to every job. "You don't want to use the large language model sledgehammer to hit every nail," he said.

"I think, fairly uniquely compared to most other telco operators, we actually have the ability, the skill inside of Orange to help make these decisions about what tool to use when. So we prefer to use a statistical method or basic machine learning to solve problems because those results are more [...] explainable. They're usually cheaper, and they're usually less impactful on the environment," he added.

In fact, Jarrett says one of the things Orange is investigating at the moment is how to use multiple AI models together to solve problems. The notion, he added, is called agents, and refers to a high-level abstraction of a problem, such as asking how the network in France is working on a given day. This, he said, will enable the company to solve complex problems more dynamically.

In the meantime, the company is making a range of GenAI models available to its employees, including ChatGPT, Dolly and Mistral. To do so, it has built a solution that Jarrett says provides a "secure, European-resident version of leading AI models that we make available to the entire company."

Improving customer service

Jarrett says this is a more controlled and safer way for employees to use the models than accessing them directly. The solution also notifies the employee of the cost of running a specific model to answer a question. It has been available for several months and has so far been used by 12% of employees.

Orange has already deployed GenAI in many countries within its customer service solutions to predict what the most appealing offer may be to an individual customer, Jarrett said, adding "what we're trialling right now is can generative AI help us to customize and personalize the text of that offer? Does that make the offer incrementally more appealing?"

Another potential use case is in transcribing a conversation with a customer care agent in real time, using generative AI to create prompts. The tool is still in development but could help new recruits to improve faster, raising employee and customer satisfaction, said Jarrett.

While Orange doesn't currently use GenAI for any use cases in the network, some are under development, although few details are being shared at this stage. One use case involves predicting when batteries at cell sites may need replacing.

Jarrett admits, however, that GenAI is still facing a number of challenges, such as hallucinations. "In a scenario where the outputs have to be correct 100% of the time, we're not going to use generative AI for that today, because [it's] not correct 100% of the time," he said.

Dealing with hallucinations

Yet it can be applied in areas that are less sensitive. "For example, if for internal use you want to have a summary of an enormous transcript of a long meeting that you missed, it's okay if the model hallucinates a little bit," he added.

Hallucinations cannot be stopped entirely and will likely continue to be a problem for some time, said Jarrett. But he believes RAG and fine-tuning could mitigate the issue to some extent.

"The majority of the time, if we're good at prompt engineering and we're good at passing the appropriate information with the response, the model generates very, very useful, relevant answers," Jarrett said about the results achieved with RAG.

The availability and quality of data is another issue that is often discussed, and also one that Orange is trying to address. Using data historically kept in separate silos has been difficult, said Jarrett. "[The] availability of the data from the marketing team to be able to run a campaign on where our network was relatively strong, for example – those use cases were either impossible, or took many, many, many months of manual meetings and collaboration."

As a result, the company is trying to create a marketplace where data is made widely available inside each country and appropriately labeled. Orange calls this approach data democracy.

Continued here:
Orange isn't building its own AI foundation model here's why - Light Reading

Wall Street’s Favorite Machine Learning Stocks? 3 Names That Could Make You Filthy Rich – InvestorPlace

Machine learning stocks receive a lot of love in 2024


United States equities are on the rise again in 2024. The S&P 500 and Nasdaq have appreciated 7.2% and 7.4%, respectively. With stocks back on the rise, equities investors may want to consider putting money in innovative companies. Given the traction AI-related technology companies gained last year, machine learning stocks may also receive a lot of love in 2024.

Machine learning (ML) is a branch of artificial intelligence (AI) that enables computers to learn from data and experience without explicit programming. Over the past decade, the technology has garnered attention for its numerous applications, and ML has also received positive attention from Wall Street. Below are three machine learning stocks that could make investors rich over the long term.


UiPath (NYSE:PATH) creates and implements software allowing customers to automate various business processes using robotic process automation (RPA) and artificial intelligence.

The UiPath Business Automation Platform enables employees to quickly build automations for both existing and new processes by using software robots to perform a myriad of repetitive tasks. These range from simply logging into applications or moving folders to extracting information from documents and updating information fields and databases. UiPath also provides a number of turnkey automation solutions, allowing the company to target customers in a variety of industries, including banking, healthcare and manufacturing.

Last year, shares of PATH almost doubled. Since the start of the new year, there has been a pullback across all the major indices and, of course, UiPath, at its frothy valuation, saw some selling pressure. The company's share price has fallen 7% YTD. Selling pressure has continued slightly after weaker-than-expected guidance in UiPath's Q4 2023 earnings report. Outside of guidance, the company beat both revenue and earnings estimates. Q4 revenue increased 31% YOY to $405 million, and annual recurring revenue increased 22% to $1.5 billion. The company also achieved its first quarter of GAAP profitability as a public company in the fourth quarter.

Strong financial figures, despite weaker-than-expected guidance, could make UiPath a strong performer in 2024.


It's hard to make a machine learning list without including a semiconductor name, since semiconductors help machine learning programs work the way they do. Advanced Micro Devices (NASDAQ:AMD) has built a range of advanced hardware for gaming and other computing applications. AMD's RDNA 3 architecture-based Radeon GPUs now support desktop-level AI and machine learning workflows.

2024 will be a big year for AMD in terms of AI and ML computing. The chipmaker announced the MI300X GPU chipset almost a year ago in its second-quarter 2023 earnings report. To follow that up, in the third-quarter earnings report, AMD announced it expects to sell $2 billion in AI chips next year. Because these AI chips are still in high demand in North America, Europe and Asia, AMD will likely reap a significant profit upon entering the space.

Wall Street, notably, is loving AMD's stock. Wall Street firms have recently begun to boost their target prices for the chipmaker. The investment bank Jefferies raised its target price for AMD to $200/share from $130/share. JPMorgan, Goldman Sachs, Baird and a host of other investment banks also made significant increases to their target prices in late January 2024. Moreover, Japanese bank Mizuho Securities recently raised its target price for AMD from $200/share to $235/share.


Last on our list of machine learning stocks is Palantir Technologies (NYSE:PLTR). Palantir has received a lot of love from some on Wall Street and a number of retail investors. Shares have risen 37% YTD. For those who don't know, Palantir initially focused on serving the defense and intelligence sectors but has since expanded its customer base to include various industries such as healthcare, energy and finance. The company provides a range of AI- and ML-based data analytics tools for businesses.

Most recently, Palantir has enjoyed a lot of attention due to its new AI Platform (AIP). AIP can deploy commercial and open-source large language models onto internally held data sets and, from there, recommend business processes and actions. Although I think Palantir has become overvalued, with many believing it's a fully-grown AI company when it's just getting started, the company certainly has the potential to make investors money over the long term.

On the date of publication, Tyrik Torres did not have (either directly or indirectly) any positions in the securities mentioned in this article. The opinions expressed in this article are those of the writer, subject to the InvestorPlace.com Publishing Guidelines.

Tyrik Torres has been studying and participating in financial markets since he was in college, and he has a particular passion for helping people understand complex systems. His areas of expertise are semiconductor and enterprise software equities. He has work experience in both investing (public and private markets) and investment banking.

Continued here:
Wall Street's Favorite Machine Learning Stocks? 3 Names That Could Make You Filthy Rich - InvestorPlace

18 Cutting-Edge Artificial Intelligence Applications in 2024 – Simplilearn

The function and popularity of Artificial Intelligence are soaring by the day. Artificial Intelligence is the ability of a system or a program to think and learn from experience. AI applications have significantly evolved over the past few years and have found their applications in almost every business sector. This article will help you learn the top Artificial Intelligence applications in the real world.

Here is the list of the top 18 applications of AI (Artificial Intelligence):

Artificial Intelligence technology is used to create recommendation engines through which you can engage better with your customers. These recommendations are made in accordance with their browsing history, preferences, and interests. This helps improve your relationship with your customers and their loyalty towards your brand.
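As a simple illustration of how such an engine can work, the sketch below scores items by cosine similarity between a user's browsing profile and item features; the data and feature dimensions are made up for demonstration.

```python
import numpy as np

# rows = items, columns = hypothetical feature dimensions (category, price tier, ...)
item_features = np.array([
    [1.0, 0.0, 0.5],   # item 0
    [0.9, 0.1, 0.4],   # item 1 (similar to item 0)
    [0.0, 1.0, 0.8],   # item 2 (very different)
])

# User profile = average of the features of items they browsed (here: item 0)
user_profile = item_features[[0]].mean(axis=0)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

scores = [cosine(user_profile, f) for f in item_features]
ranked = np.argsort(scores)[::-1]
print("Recommended item order:", ranked)  # item 1 ranks just below item 0
```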

Virtual shopping assistants and chatbots help improve the user experience while shopping online. Natural Language Processing is used to make the conversation sound as human and personal as possible. Moreover, these assistants can engage with your customers in real time. Did you know that customer service on amazon.com could soon be handled by chatbots?

Credit card fraud and fake reviews are two of the most significant issues that e-commerce companies deal with. By considering usage patterns, AI can help reduce the possibility of credit card fraud taking place. Many customers also prefer to buy a product or service based on customer reviews, and AI can help identify and handle fake ones.

Although the education sector is the one most influenced by humans, Artificial Intelligence has slowly begun to take root in it as well. Even here, this gradual adoption of Artificial Intelligence has helped increase productivity among faculty and let them concentrate more on students than on office or administration work.

Some of these applications in this sector include:

Artificial Intelligence can help educators with non-educational tasks: facilitating and automating personalized messages to students, handling back-office tasks like grading paperwork, arranging and facilitating parent and guardian interactions, giving feedback on routine issues, and managing enrollment, courses, and HR-related topics.

Content like video lectures, conferences, and textbook guides can be digitized using Artificial Intelligence. Different interfaces, such as animations, and customized learning content can be applied for students of different grades.

Artificial Intelligence helps create a rich learning experience by generating and providing audio and video summaries and integral lesson plans.

Without even the direct involvement of the lecturer or the teacher, a student can access extra learning material or assistance through Voice Assistants. This reduces the printing costs of temporary handbooks and also provides answers to very common questions easily.

Using top AI technologies, hyper-personalization techniques can be used to monitor students' data thoroughly, so that habits, lesson plans, reminders, study guides, flash notes, frequency of revision, etc., can be easily generated.

Artificial Intelligence has a lot of influence on our lifestyle. Let us discuss a few of them.

Automobile manufacturing companies like Toyota, Audi, Volvo, and Tesla use machine learning to train computers to think and evolve like humans when it comes to driving in any environment, and to detect objects in order to avoid accidents.

The email that we use in our day-to-day lives has AI that filters out spam emails, sending them to spam or trash folders and letting us see only the filtered content. The popular email provider Gmail has managed to reach a filtration accuracy of approximately 99.9%.
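The statistical core of such a filter can be illustrated with a toy naive Bayes classifier; real providers like Gmail use far more sophisticated, proprietary models, and the training messages below are invented.

```python
from collections import Counter
import math

spam = ["win money now", "free money offer"]
ham = ["meeting notes attached", "lunch tomorrow"]

def word_counts(docs):
    c = Counter()
    for d in docs:
        c.update(d.split())
    return c

spam_counts, ham_counts = word_counts(spam), word_counts(ham)
vocab = set(spam_counts) | set(ham_counts)

def log_prob(msg, counts, total_docs, n_docs):
    # log P(class) + sum of log P(word | class), with Laplace smoothing
    lp = math.log(n_docs / total_docs)
    total = sum(counts.values())
    for w in msg.split():
        lp += math.log((counts[w] + 1) / (total + len(vocab)))
    return lp

msg = "free money tomorrow"
total = len(spam) + len(ham)
is_spam = (log_prob(msg, spam_counts, total, len(spam))
           > log_prob(msg, ham_counts, total, len(ham)))
print("spam" if is_spam else "ham")  # "spam": money/free outweigh tomorrow
```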

Our favorite devices like our phones, laptops, and PCs use facial recognition techniques to detect and identify faces in order to provide secure access. Apart from personal usage, facial recognition is a widely used Artificial Intelligence application even in high-security areas in several industries.

Various platforms that we use in our daily lives, like e-commerce, entertainment websites, social media, and video-sharing platforms such as YouTube, all use recommendation systems to gather user data and provide customized recommendations to increase engagement. This is a very widely used Artificial Intelligence application in almost all industries.

Based on research from MIT, GPS technology can provide users with accurate, timely, and detailed information to improve safety. The technology uses a combination of Convolutional Neural Networks and Graph Neural Networks, which makes lives easier for users by automatically detecting the number of lanes and road types behind obstructions on the roads. AI is heavily used by Uber and many logistics companies to improve operational efficiency, analyze road traffic, and optimize routes.

Robotics is another field where Artificial Intelligence applications are commonly used. Robots powered by AI use real-time updates to sense obstacles in their path and pre-plan their journeys instantly.


Did you know that companies use intelligent software to ease the hiring process?

Artificial Intelligence helps with blind hiring. Using machine learning software, you can examine applications based on specific parameters. AI-driven systems can scan job candidates' profiles and resumes to give recruiters an understanding of the talent pool they must choose from.

Artificial Intelligence finds diverse applications in the healthcare sector. AI applications are used in healthcare to build sophisticated machines that can detect diseases and identify cancer cells. Artificial Intelligence can help analyze chronic conditions with lab and other medical data to ensure early diagnosis. AI uses the combination of historical data and medical intelligence for the discovery of new drugs.

Artificial Intelligence is used to identify defects and nutrient deficiencies in the soil. Using computer vision, robotics, and machine learning applications, AI can also analyze where weeds are growing. AI bots can help harvest crops at a higher volume and faster pace than human laborers.

Another sector where Artificial Intelligence applications have found prominence is the gaming sector. AI can be used to create smart, human-like NPCs to interact with the players.

It can also be used to predict human behavior, through which game design and testing can be improved. The Alien: Isolation game released in 2014 uses AI to stalk the player throughout the game. The game uses two Artificial Intelligence systems: a Director AI that frequently knows the player's location, and an Alien AI, driven by sensors and behaviors, that continuously hunts the player.


Artificial Intelligence is used to build self-driving vehicles. AI can be used along with the vehicle's cameras, radar, cloud services, GPS, and control signals to operate the vehicle. AI can improve the in-vehicle experience and provide additional systems like emergency braking, blind-spot monitoring, and driver-assist steering.

On Instagram, AI considers your likes and the accounts you follow to determine what posts you are shown on your explore tab.

Artificial Intelligence is also used along with a tool called DeepText. With this tool, Facebook can understand conversations better. It can be used to translate posts from different languages automatically.

AI is used by Twitter for fraud detection and for removing propaganda and hateful content. Twitter also uses AI to recommend tweets that users might enjoy, based on the types of tweets they engage with.

Artificial Intelligence (AI) applications are popular in the marketing domain as well.

AI chatbots can comprehend natural language and respond to people online who use the "live chat" feature that many organizations provide for customer service. AI chatbots are effective with the use of machine learning and can be integrated in an array of websites and applications. AI chatbots can eventually build a database of answers, in addition to pulling information from an established selection of integrated answers. As AI continues to improve, these chatbots can effectively resolve customer issues, respond to simple inquiries, improve customer service, and provide 24/7 support. All in all, these AI chatbots can help to improve customer satisfaction.

It has been reported that 80% of banks recognize the benefits that AI can provide. Whether it's personal finance, corporate finance, or consumer finance, the highly evolved technology offered through AI can significantly improve a wide range of financial services. For example, customers looking for help with wealth management solutions can easily get the information they need through SMS text messaging or online chat, all AI-powered. Artificial Intelligence can also detect changes in transaction patterns and other potential red flags that can signify fraud, which humans can easily miss, thus saving businesses and individuals from significant loss. Aside from fraud detection and task automation, AI can also better predict and assess loan risks.

If there's one concept that has taken everyone by storm in this beautiful world of technology, it has to be AI (Artificial Intelligence), without question. AI has seen a wide range of applications throughout the years, including healthcare, robotics, eCommerce, and even finance.

Astronomy, on the other hand, is a largely unexplored topic that is just as intriguing and thrilling as the rest. When it comes to astronomy, one of the most difficult problems is analyzing the data. As a result, astronomers are turning to machine learning and Artificial Intelligence (AI) to create new tools. Having said that, consider how Artificial Intelligence has altered astronomy and is meeting the demands of astronomers.

Many people believe that Artificial Intelligence (AI) is the present and future of the technology sector. Many industry leaders employ AI for a variety of purposes, including providing valued services and preparing their companies for the future.

Data security, which is one of the most important assets of any tech-oriented firm, is one of the most prevalent and critical applications of AI. With confidential data ranging from consumer data (such as credit card information) to organizational secrets kept online, data security is vital for any institution to satisfy both legal and operational duties. This work is now as difficult as it is vital, and many businesses deploy AI-based security solutions to keep their data out of the wrong hands.

Because the world is smarter and more connected than ever before, the function of Artificial Intelligence in business is critical today. According to several estimates, cyberattacks will get more tenacious over time, and security teams will need to rely on AI solutions to keep systems and data under control.

A human may not be able to recognize all of the hazards that a business confronts. Every year, hackers launch hundreds of millions of assaults for a variety of reasons. Unknown threats can cause severe network damage. Worse, they can have an impact before you recognize, identify, and prevent them.

As attackers test different tactics, ranging from simple malware assaults to sophisticated ones, contemporary solutions should be used to prevent them. Artificial Intelligence has proven to be one of the most effective security solutions for mapping and preventing unexpected threats from wreaking havoc on a corporation.

AI assists in detecting data overflow in a buffer. When programs consume more data than usual, this is referred to as buffer overflow; faults can also be caused by human triggers corrupting crucial data. These blunders are observable by AI and are detected in real time, preventing future dangers.

AI can precisely discover cybersecurity weaknesses, faults, and other problems using Machine Learning. Machine Learning also assists AI in identifying questionable data sent by any application. Malware and viruses used by hackers to gain access to systems and steal data exploit programming language flaws.

Artificial Intelligence technology is constantly being developed by cybersecurity vendors. In its advanced version, AI is designed to detect flaws in the system or even in an update, and it would instantly exclude anybody attempting to exploit those issues. AI would be an outstanding tool for preventing any threat from occurring. It may install additional firewalls as well as rectify code faults that lead to dangers.

Threat response is what happens after a threat has entered the system. As previously explained, AI is used to detect unusual behavior and create a profile of viruses or malware. AI then takes appropriate action against the virus or malware. The response consists mostly of removing the infection, repairing the fault, and managing the harm done. Finally, AI ensures that such an incident does not happen again and takes proper preventative actions.

AI allows us to detect unusual behavior in a system. It is capable of detecting unusual or anomalous behavior by continually scanning a system and gathering an appropriate amount of data. In addition, AI identifies illegal access. When unusual behavior is identified, Artificial Intelligence employs particular elements to determine whether it represents a genuine threat or a fabricated warning. Machine Learning is used to help AI determine what is and is not aberrant behavior, and it improves with time, which will allow Artificial Intelligence to detect even minor anomalies. As a result, AI can point to anything wrong with the system.
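A minimal sketch of this kind of behavioral anomaly detection is a baseline-and-deviation test; production security tools use much richer features and models, and the traffic numbers below are invented.

```python
import numpy as np

rng = np.random.default_rng(5)
baseline = rng.normal(100, 10, size=1000)   # e.g., requests/minute on normal days

mu, sigma = baseline.mean(), baseline.std()

def is_anomalous(observation, threshold=4.0):
    # z-score test: how many standard deviations from learned normal behavior?
    return abs(observation - mu) / sigma > threshold

print(is_anomalous(104))   # False: within normal variation
print(is_anomalous(500))   # True: likely an attack or fault worth flagging
```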

Intelligent technology has become a part of our daily lives in recent years. And, as technology advances across society, new uses of AI, notably in transportation, are becoming mainstream. This has created a new market for firms and entrepreneurs to develop innovative solutions for making public transportation more comfortable, accessible, and safe.

Intelligent transportation systems have the potential to become one of the most effective methods to improve the quality of life for people all around the world. There are multiple instances of similar systems in use in various sectors.

Truck platooning, which networks heavy goods vehicles (HGVs), for example, might be extremely valuable for vehicle transport businesses or for moving other large items.

The lead vehicle in a truck platoon is steered by a human driver; the human drivers in the other trucks drive passively, only taking the wheel in exceptionally dangerous or difficult situations.

Because all of the trucks in the platoon are linked via a network, they travel in formation and mirror the actions of the human driver in the lead vehicle at the same time. So, if the lead driver comes to a complete stop, all of the vehicles following him do as well.

Clogged city streets are a key impediment to urban transportation all around the world. Cities throughout the world have enlarged highways, erected bridges, and established other modes of transportation such as train travel, yet the traffic problem persists. However, AI advancements in traffic management provide a genuine promise of changing the situation.

Intelligent traffic management may be used to enforce traffic regulations and promote road safety. For example, Alibaba's City Brain initiative in China uses AI technologies such as predictive analysis, big data analysis, and a visual search engine in order to track road networks in real-time and reduce congestion.

Building a city requires an efficient transportation system, and AI-based traffic management technologies are powering next-generation smart cities.

Platforms like Uber and OLA leverage AI to improve user experiences by connecting riders and drivers, improving user communication and messaging, and optimizing decision-making. For example, Uber has its own proprietary ML-as-a-service platform called Michelangelo that can anticipate supply and demand, identify trip abnormalities like wrecks, and estimate arrival timings.

AI-enabled route planning using predictive analytics may help both businesses and people. Ride-sharing services already achieve this by analyzing numerous real-world parameters to optimize route planning.

AI-enabled route planning is a terrific approach for businesses, particularly logistics and shipping industries, to construct a more efficient supply network by anticipating road conditions and optimizing vehicle routes. Predictive analytics in route planning is the intelligent evaluation by a machine of a number of road usage parameters such as congestion level, road restrictions, traffic patterns, consumer preferences, and so on.

Cargo logistics companies, such as vehicle transport services or other general logistics firms, may use this technology to reduce delivery costs, accelerate delivery times, and better manage assets and operations.

A century ago, the idea of machines being able to comprehend, do complex computations, and devise efficient answers to pressing issues was more of a science fiction writer's vision than a predictive reality. Still, as we enter the third decade of the twenty-first century, we can't fathom our lives without stock trading and marketing bots, manufacturing robots, smart assistance, virtual travel agents, and other innovations made possible by advances in Artificial Intelligence and machine learning. The importance of Artificial Intelligence and machine learning in the automotive sector cannot be overstated.

With Artificial Intelligence driving more applications to the automotive sector, more businesses are deciding to implement Artificial Intelligence and machine learning models in production.

Infusing AI into the production experience allows automakers to benefit from smarter factories, boosting productivity and lowering costs. AI may be utilized in automobile assembly, supply chain optimization, employing robots on the manufacturing floor, improving performance using sensors, designing cars, and in post-production activities.

The automobile sector has been beset by supply chain interruptions and challenges in 2021 and 2022. AI can also assist in this regard. AI helps firms identify the hurdles they will face in the future by forecasting and replenishing supply chains as needed. AI may also assist with routing difficulties, volume forecasts, and other concerns.

We all wish to have a pleasant journey in our vehicles. Artificial Intelligence can also help with this. When driving, Artificial Intelligence (AI) may assist drivers in remaining focused by decreasing distractions, analyzing driving behaviors, and enhancing the entire customer experience. Passengers can benefit from customized accessibility as well as in-car delivery services thanks to AI.

The procedure of inspecting an automobile by a rental agency, insurance provider, or even a garage is very subjective and manual. With AI, car inspection may go digital, with modern technology being able to analyze a vehicle, identify where the flaws are, and produce a thorough status report.

Everyone desires a premium vehicle and experience. Wouldn't you prefer to know if something is wrong with your automobile before it breaks down? In this application, AI enables extremely accurate predictive monitoring, fracture detection, and other functions.

Read the original:
18 Cutting-Edge Artificial Intelligence Applications in 2024 - Simplilearn

Machine-learning-based global optimization of microwave passives with variable-fidelity EM models and response … – Nature.com


More here:
Machine-learning-based global optimization of microwave passives with variable-fidelity EM models and response ... - Nature.com

Benchmarking machine learning and parametric methods for genomic prediction of feed efficiency-related traits in … – Nature.com

The FE-related traits and genomic information were obtained for 1,156 animals from an experimental breeding program at the Beef Cattle Research Center (Institute of Animal Science, IZ).

Animals were from an experimental breeding program at the Beef Cattle Research Center at the Institute of Animal Science (IZ) in Sertãozinho, São Paulo, Brazil. Since the 1980s, the experimental station has maintained three selection herds: Nellore control (NeC), with animals selected for yearling body weight (YBW) with a selection differential close to zero within birth year and herd, while Nellore Selection (NeS) and Nellore Traditional (NeT) animals are selected for YBW with a maximum selection differential, also within birth year and herd [25]. In the NeT herd, sires from commercial herds or from NeS were eventually used in the breeding season, while the NeC and NeS were closed herds (only sires from the same herd were used in the breeding season), with the inbreeding rate controlled by planned matings. In addition, the NeT herd has been selected for lower residual feed intake (RFI) since 2013. In the three herds, animal selection is based on YBW measured at 378 days of age in young bulls.

The FE-related traits were evaluated on 1,156 animals born between 2004 and 2015 in a feeding efficiency trial, in which they were either housed in individual pens (683 animals) or in group pens equipped with the GrowSafe feeding system (473 animals), with animals grouped by sex. Of those, 146 animals were from the NeC herd (104 young bulls and 42 heifers), 300 from the NeS herd (214 young bulls and 86 heifers), and 710 from the NeT herd (483 young bulls and 227 heifers). Both feeding trials comprised at least 21 days of adaptation to the feedlot diet and management and at least 56 days of data collection. The young bulls and heifers had an average age at the end of the feeding trial of 366 ± 27.5 and 384 ± 45.4 days, respectively.

A total of 780 animals were genotyped with the Illumina BovineHD BeadChip assay (770k, Illumina Inc., San Diego, CA, USA), while 376 animals were genotyped with the GeneSeek Genomic Profiler (GGP Indicus HD, 77K). The animals genotyped with the GGP chip were imputed to the HD panel using FImpute v.3 [26] with an expected accuracy higher than 0.97. Autosomal SNP markers with a minor allele frequency (MAF) lower than 0.10 and a significant deviation from Hardy–Weinberg equilibrium (P < 10⁻⁵) were removed, and markers and samples with a call rate lower than 0.95 were also removed. The MAF threshold of 10% was used to remove genetic markers with lower significance and noisy information in a stratified population. After this quality control procedure, genotypes from 1,024 animals and 305,128 SNP markers remained for GS analyses. Population substructure was evaluated using a principal component analysis (PCA) based on the genomic relationship matrix using the ade4 R package (Supplementary Figure S1) [27].
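A minimal sketch of these quality-control filters (MAF ≥ 0.10 and call rate ≥ 0.95; the Hardy–Weinberg test is omitted for brevity) on a simulated genotype matrix might look as follows; the data are illustrative, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
geno = rng.integers(0, 3, size=(1156, 1000)).astype(float)  # animals x SNPs, coded 0/1/2
geno[rng.random(geno.shape) < 0.02] = np.nan                # sprinkle in missing calls

call_rate = 1 - np.isnan(geno).mean(axis=0)   # per-SNP proportion of non-missing calls
p = np.nanmean(geno, axis=0) / 2              # allele frequency of the counted allele
maf = np.minimum(p, 1 - p)                    # minor allele frequency

keep = (maf >= 0.10) & (call_rate >= 0.95)
geno_qc = geno[:, keep]
print(f"SNPs kept after QC: {keep.sum()} of {geno.shape[1]}")
```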

Animals were weighed without fasting at the beginning and end of the feeding trial, as well as every 14 days during the experimental period. The mixed ration (dry corn grain, corn silage, soybean, urea, and mineral salt) was offered ad libitum and formulated with 67% total digestible nutrients (TDN) and 13% crude protein (CP), aiming for an average daily gain (ADG) of 1.1 kg.

The following feed efficiency-related traits were evaluated: ADG, dry matter intake (DMI), feed efficiency (FE), and RFI. In the individual pens, the orts were weighed daily in the morning before feed delivery to calculate the daily dietary intake. In the group pens, the GrowSafe feeding system automatically recorded the feed intake. Thus, the DMI (expressed as kg/day) was estimated as the feed intake of each animal with subsequent adjustment for dry matter content. ADG was estimated as the slope of the linear regression of body weight (BW) on feeding trial days, and FE was expressed as the ratio of ADG to DMI. Finally, RFI was calculated within each contemporary group (CG) as the difference between the observed and expected feed intake, considering the average metabolic body weight (MBW) and ADG of each animal (Koch et al., 1963), as follows:

$$DMI = CG + \beta_0 + \beta_1 ADG + \beta_2 MBW + \varepsilon$$

where \(\beta_0\) is the model intercept, \(\beta_1\) and \(\beta_2\) are the linear regression coefficients for \(ADG\) and \(MBW = BW^{0.75}\), respectively, and \(\varepsilon\) is the residual of the equation, representing the RFI estimate.
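A sketch of this RFI calculation, regressing DMI on ADG and MBW with contemporary-group effects and taking the residual, is shown below; the DataFrame and its simulated columns are hypothetical stand-ins for the study's data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({
    "CG": rng.integers(0, 5, n),              # contemporary group label
    "ADG": rng.normal(1.1, 0.2, n),           # average daily gain, kg/day
    "BW": rng.normal(380, 40, n),             # body weight, kg
})
df["MBW"] = df["BW"] ** 0.75                  # metabolic body weight
df["DMI"] = 2 + 1.5 * df["ADG"] + 0.05 * df["MBW"] + rng.normal(0, 0.3, n)

# Fit DMI = CG + b1*ADG + b2*MBW by least squares; the CG dummy columns
# absorb the intercept term of the equation above.
X = pd.get_dummies(df["CG"], prefix="CG", dtype=float)
X["ADG"], X["MBW"] = df["ADG"], df["MBW"]
coef, *_ = np.linalg.lstsq(X.to_numpy(), df["DMI"].to_numpy(), rcond=None)

df["RFI"] = df["DMI"] - X.to_numpy() @ coef   # residual = RFI estimate
print(df["RFI"].describe())
```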

The contemporary groups (CG) were defined by sex, year of birth, type of feed trial pen (individual or collective), and selection herd. Phenotypic observations with values outside the interval of ±3.5 standard deviations from the mean of each CG for each trait were excluded, and the number of animals per CG ranged from 10 to 70.

The (co)variance components and heritability for FE-related traits were estimated considering a multi-trait GBLUP (MTGBLUP) as follows:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{a} + \mathbf{e},$$

where \(\mathbf{y}\) is the matrix of phenotypic FE-related traits (ADG, FE, DMI, and RFI) of dimension N×4 (N individuals and four traits); \(\boldsymbol{\beta}\) is the vector of fixed effects, comprising linear and quadratic effects of cow age and the linear effect of animal age at the beginning of the test; \(\mathbf{a}\) is the vector of additive genetic effects (breeding values) of the animals, and \(\mathbf{e}\) is a vector of residual terms. \(\mathbf{X}\) and \(\mathbf{Z}\) are the incidence matrices related to the fixed (\(\boldsymbol{\beta}\)) and random (\(\mathbf{a}\)) effects, respectively. The random animal and residual effects were assumed to be normally distributed, as \(\mathbf{a} \sim N(0, \mathbf{G} \otimes \mathbf{S}_{\mathbf{a}})\) and \(\mathbf{e} \sim N(0, \mathbf{I} \otimes \mathbf{S}_{\mathbf{e}})\), where \(\mathbf{G}\) is the additive genomic relationship matrix between genotyped individuals according to VanRaden [28], \(\mathbf{I}\) is an identity matrix, \(\otimes\) is the Kronecker product, and

$$\mathbf{S}_{\mathbf{a}} = \begin{bmatrix} \sigma^2_{a_1} & \cdots & \sigma_{a_{1,4}} \\ \vdots & \ddots & \vdots \\ \sigma_{a_{1,4}} & \cdots & \sigma^2_{a_4} \end{bmatrix} \quad \text{and} \quad \mathbf{S}_{\mathbf{e}} = \begin{bmatrix} \sigma^2_{e_1} & \cdots & \sigma_{e_{1,4}} \\ \vdots & \ddots & \vdots \\ \sigma_{e_{1,4}} & \cdots & \sigma^2_{e_4} \end{bmatrix}$$

are the additive genetic and residual (co)variance matrices, respectively. The \(\mathbf{G}\) matrix was obtained according to VanRaden [28]:

$$\mathbf{G} = \frac{\mathbf{M}\mathbf{M}'}{2\sum_{j=1}^{m} p_j (1 - p_j)}$$

where \(\mathbf{M}\) is the SNP marker matrix with codes 0, 1, and 2 for genotypes AA, AB, and BB, adjusted for allele frequency as \(2p_j\), and \(p_j\) is the frequency of the second allele of the j-th SNP marker.
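The VanRaden G matrix above is straightforward to compute from a 0/1/2 genotype matrix; the following sketch uses simulated genotypes for illustration and is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
M012 = rng.integers(0, 3, size=(100, 5000)).astype(float)  # animals x SNPs

p = M012.mean(axis=0) / 2                  # frequency of the counted allele per SNP
Z = M012 - 2 * p                           # center each SNP column by 2*p_j
G = (Z @ Z.T) / (2 * np.sum(p * (1 - p)))  # VanRaden (2008) genomic relationship
print(G.shape, np.round(np.diag(G).mean(), 3))  # diagonal averages near 1
```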

The analyses were performed using the restricted maximum likelihood (REML) method through the airemlf90 software [29]. The predictf90 software [29] was used to obtain the phenotypes adjusted for the fixed effects and covariates (\(y^* = y - X\hat{\beta}\)). The adjusted phenotypes were used as the response variable in the genomic predictions.

The GEBV accuracy (\(Acc_{GEBV}\)) in the whole population was calculated based on the prediction error variance (PEV) and the genetic variance for each FE-related trait (\(\sigma^2_a\)) using the following equation [30]: \(Acc = 1 - \sqrt{PEV/\sigma^2_a}\).

A forward validation scheme was applied for computing the prediction accuracies using machine learning and parametric methods, splitting the dataset based on year of birth, with animals born between 2004 and 2013 assigned as the reference population (n = 836) and those born in 2014 and 2015 (n = 188) as the validation set. For the ML approaches, we randomly split the training dataset into five folds to train the models.
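The split described above might be sketched as follows, with the forward (year-based) hold-out plus the five-fold partition of the training set used for the ML models; the arrays are placeholders for the real phenotypes and markers.

```python
import numpy as np
from sklearn.model_selection import KFold

rng = np.random.default_rng(3)
birth_year = rng.integers(2004, 2016, size=1024)
X = rng.normal(size=(1024, 500))   # marker features (placeholder)
y = rng.normal(size=1024)          # adjusted phenotype y* (placeholder)

# Forward validation: train on 2004-2013, validate on 2014-2015
train_mask = birth_year <= 2013
X_train, y_train = X[train_mask], y[train_mask]
X_valid, y_valid = X[~train_mask], y[~train_mask]

# Five-fold split inside the reference population for model tuning
for fold, (tr, te) in enumerate(
        KFold(n_splits=5, shuffle=True, random_state=0).split(X_train)):
    # fit/tune the model on X_train[tr], evaluate on X_train[te]
    print(f"fold {fold}: {len(tr)} train / {len(te)} tune")
```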

Genomic prediction for FE-related traits considering the STGBLUP can be described as follows:

$$\mathbf{y}^{*} = \boldsymbol{\mu} + \mathbf{Z}\mathbf{a} + \mathbf{e}$$

where \(\mathbf{y}^*\) is the N×1 vector of adjusted phenotypic values for FE-related traits, \(\mu\) is the model intercept, \(\mathbf{Z}\) is the incidence matrix connecting observations; \(\mathbf{a}\) is the vector of predicted values, assumed to follow a normal distribution given by \(N(0, \mathbf{G}\sigma^2_a)\), and \(\mathbf{e}\) is the N×1 vector of residual values, considered normally distributed as \(N(0, \mathbf{I}\sigma^2_e)\), in which \(\mathbf{I}\) is an identity matrix and \(\sigma^2_e\) is the residual variance. The STGBLUP model was fitted using the blupf90+ software [29].

Genomic prediction for FE-related traits considering MTGBLUP can be described as follows:

$$\mathbf{y}^{*} = \boldsymbol{\mu} + \mathbf{Z}\mathbf{a} + \mathbf{e}$$

where \(\mathbf{y}^*\) is the matrix of adjusted phenotypes of dimension N×4, \(\boldsymbol{\mu}\) is the trait-specific intercept vector, \(\mathbf{Z}\) is the incidence matrix for the random effect; \(\mathbf{a}\) is an N×4 matrix of predicted values, assumed to follow a normal distribution given by \(MVN(0, \mathbf{G} \otimes \mathbf{S}_{\mathbf{a}})\), where \(\mathbf{S}_{\mathbf{a}}\) represents the genetic (co)variance matrix for the FE-related traits (4×4). The residual effects (\(\mathbf{e}\)) were considered normally distributed as \(MVN(0, \mathbf{I} \otimes \mathbf{S}_{\mathbf{e}})\), in which \(\mathbf{I}\) is an identity matrix and \(\mathbf{S}_{\mathbf{e}}\) is the residual (co)variance matrix for FE-related traits (4×4). The MTGBLUP was implemented in the BGLR R package [14], considering a Bayesian GBLUP with a multivariate Gaussian model with an unstructured (co)variance matrix between traits (\(\mathbf{S}_{\mathbf{a}}\)), using Gibbs sampling with 200,000 iterations, including 20,000 samples as burn-in and a thinning interval of 5 cycles. Convergence was checked by visual inspection of trace plots and distribution plots of the residual variance.

Five Bayesian regression models with different priors were used for the GS analyses: Bayesian ridge regression (BRR), Bayesian Lasso (BL), BayesA, BayesB, and BayesC. The Bayesian algorithms for GS were implemented using the R package BGLR version 1.09 [14]. The BGLR default priors were used for all models, with 5 degrees of freedom (\(df_u\)) and a scale parameter (S). The Bayesian analyses were performed considering Gibbs sampling chains of 200,000 iterations, with the first 20,000 iterations excluded as burn-in and a sampling interval of 5 cycles. Convergence was checked by visual inspection of trace plots and distribution plots of the residual variance. For the Bayesian regression methods, the general model can be described as follows:

$$\mathbf{y}^{*} = \mu + \sum_{w=1}^{p} x_{iw} u_w + e_i$$

where \(\mu\) is the model intercept; \(x_{iw}\) is the genotype of the i-th animal at locus w (coded as 0, 1, and 2); \(u_w\) is the (additive) marker effect of the w-th SNP (p = 305,128); and \(e_i\) is the residual effect associated with the observation of the i-th animal, assumed to be normally distributed as \(\mathbf{e} \sim N(0, \mathbf{I}\sigma^2_e)\).

The BRR method [14] assumes a Gaussian prior distribution for the SNP marker effects (\(u_w\)), with a common variance (\(\sigma^2_u\)) across markers, so that \(p(u_1, \ldots, u_p | \sigma^2_u) = \prod_{w=1}^{p} N(u_w | 0, \sigma^2_u)\). The variance of SNP marker effects is assigned a scaled-inverse chi-squared distribution [\(p(\sigma^2_u) = \chi^{-2}(\sigma^2_u | df_u, S_u)\)], and the residual variance is also assigned a scaled-inverse chi-squared distribution with degrees of freedom \(df_e\) and scale parameter \(S_e\).

Bayesian Lasso (BL) regression [31] used an idea from Tibshirani [32] to connect the LASSO (least absolute shrinkage and selection operator) method with Bayesian analysis. In the BL, the source of variation is split into the residual term (\(\sigma^2_e\)) and the variation due to SNP markers (\(\sigma^2_{u_w}\)). The prior distribution for the additive effect of the SNP marker, \(p(u_w | \tau^2_j, \sigma^2_e)\), follows a Gaussian distribution with a marker-specific prior variance, given by \(p(u_w | \tau^2_j, \sigma^2_e) = \prod_{w=1}^{p} N(u_w | 0, \tau^2_j \sigma^2_e)\). This prior leads to marker-specific shrinkage of effects, whose extent depends on the variance parameters \(\tau^2_j\). The variance parameters \(\tau^2_j\) are assigned independent and identically distributed exponential priors, \(p(\tau^2_j | \lambda) = \prod_{j=1}^{p} \text{Exp}(\tau^2_j | \lambda^2)\), and the squared regularization parameter (\(\lambda^2\)) follows a Gamma distribution, \(p(\lambda^2) = \text{Gamma}(r, \theta)\), where r and \(\theta\) are the rate and shape parameters, respectively [31]. Thus, the marginal prior for SNP markers is given by a double exponential (DE) distribution: \(p(u_w | \lambda) = \int N(u_w | 0, \tau^2_j \sigma^2_e)\, \text{Exp}(\tau^2_j | \lambda^2)\, d\tau^2_j\). The DE distribution places higher density at zero and has thicker tails, inducing stronger shrinkage of estimates for markers with relatively small effects and less shrinkage for markers with substantial effects. The residual variance (\(\sigma^2_e\)) is specified as a scaled-inverse chi-squared prior density, with degrees of freedom \(df_e\) and scale parameter \(S_e\).

The BayesA method [14,33] considers a Gaussian distribution with null mean as the prior for the SNP marker effects \((u_{w})\), with an SNP marker-specific variance \((\sigma_{w}^{2})\). The variance associated with each marker effect is assigned a scaled-inverse chi-squared prior distribution, \(p(\sigma_{w}^{2})=\chi^{-2}(\sigma_{w}^{2}|df_{u},S_{u}^{2})\), with degrees of freedom \((df_{u})\) and scale parameter \((S_{u}^{2})\) treated as known [14]. Thus, BayesA places a t-distribution on the marker effects, i.e., \(p(u_{w}|df_{u},S_{u}^{2})=t(0,df_{u},S_{u}^{2})\), providing a thicker-tailed distribution than the Gaussian and allowing a higher probability of moderate to large SNP effects.

BayesB assumes that a known proportion of SNP markers have a null effect (i.e., a point mass at zero), while the remaining markers have non-null effects that follow univariate t-distributions [3,12], as follows:

$$p\left(u_{w}\mid \pi,df_{u},S_{B}^{2}\right)=\begin{cases}0 & \text{with probability } \pi\\ t\left(u_{w}\mid df_{u},S_{B}^{2}\right) & \text{with probability } \left(1-\pi\right)\end{cases}$$

where \(\pi\) is the proportion of SNP markers with null effect, and \((1-\pi)\) is the probability that an SNP marker has a non-null effect contributing to the variability of the FE-related trait [3]. The variance associated with the non-null SNP effects is assigned a scaled-inverse chi-squared prior distribution.

The BayesC method [34] assumes a spike-and-slab prior for marker effects, i.e., a mixture distribution in which a fixed proportion \(\pi\) of SNP markers has a null effect, whereas markers have effects sampled from a normal distribution with probability \((1-\pi)\). The prior distribution is as follows:

$$p\left(u_{w},\sigma_{w}^{2},\pi\right)=\left\{\prod_{w=1}^{p}\left[\pi\left(u_{w}=0\right)+\left(1-\pi\right)N\left(0,\sigma_{w}^{2}\right)\right]\right\}\times \chi^{-2}\left(\sigma_{w}^{2}\mid df_{u},S_{B}^{2}\right)\times \beta\left(\pi\mid p_{0},\pi_{0}\right),$$

where \(\sigma_{w}^{2}\) is the common variance of the marker effects, \(df_{u}\) and \(S_{B}^{2}\) are the degrees of freedom and scale parameter, respectively, and \(p_{0}\) and \(\pi_{0}\in[0,1]\) are the prior shape parameters of the beta distribution.
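Since these five regressions differ only in the prior assigned to the marker effects, they can in practice be fitted through a single interface. The sketch below assumes the BGLR R package (the text does not name the software used), with hypothetical objects X (genotypes coded 0/1/2) and y_star (adjusted phenotype); nIter and burnIn are illustrative values, not the paper's settings:

    library(BGLR)

    # One linear predictor term: the SNP matrix with the chosen prior on its effects.
    models <- c("BRR", "BL", "BayesA", "BayesB", "BayesC")
    fits <- lapply(models, function(m) {
      BGLR(y       = y_star,
           ETA     = list(markers = list(X = X, model = m)),
           nIter   = 20000,            # illustrative chain length
           burnIn  = 5000,             # illustrative burn-in
           verbose = FALSE)
    })
    names(fits) <- models

    y_hat_bayesb <- fits$BayesB$yHat   # fitted/predicted values, e.g., for BayesB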

Two machine learning (ML) algorithms were applied for genomic prediction: multi-layer neural network (MLNN) and support vector regression (SVR). The ML approaches were used to relax the standard assumption adopted in the linear methods, which is restricted to additive genetic effects of markers and does not consider more complex modes of gene action. Thus, the ML methods are expected to improve predictive accuracy for the different target traits. To identify the best combination of hyperparameters (i.e., parameters that must be tuned to control the learning process and obtain a model with optimal performance) in the supervised ML algorithms (MLNN and SVR), we performed a random grid search, splitting the reference population from the forward scheme into five folds [35].

In the MLNN, handling a large genomic dataset, such as 305,128 SNPs, is difficult due to the large number of parameters that must be estimated, leading to a significant increase in computational demand [36]. Therefore, an SNP pre-selection strategy based on GWAS results obtained in the training population with the MTGBLUP method (Fig. 1A) was used to reduce the number of markers considered as input to the MLNN. In addition, this strategy can remove noise from the genomic data set. In this study, the traits displayed major regions explaining a large percentage of the genetic variance, which makes the use of pre-selected markers useful [37].
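A minimal sketch of that filtering step (pct_var is a hypothetical vector holding the percentage of genetic variance explained by each SNP in the training-population GWAS):

    # Keep only SNPs explaining more than 0.125% of the genetic variance.
    keep  <- which(pct_var > 0.125)
    X_sel <- X[, keep]   # reduced genotype matrix used as input to the MLNN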

Figure 1. (A) Manhattan plot of the percentage of genetic variance explained by each SNP marker, estimated through multi-trait GWAS in the training population and used as the pre-selection strategy for the multi-layer neural network. (B) General representation of the neural network with two hidden layers used to model nonlinear dependencies between traits and SNP marker information. The input layer \((X=x_{i,p})\) refers to the SNP marker information (coded as 0, 1, and 2) of the ith animal. The selected node represents the initial weights \((W=w_{p})\), assigned as random values between −0.5 and 0.5, connecting each input node to the first hidden layer; in the second layer, \(w_{up}\) refers to the output weights from the first hidden layer. b represents the bias, which helps control the values in the activation function. The output layer \((\widehat{y})\) represents a weighted sum of the input features mapped in the second layer.

The MLNN model can be described as a two-step regression [38]. The MLNN approach consists of three different layer types: an input layer, hidden layers, and an output layer. The input layer receives the input data, i.e., the SNP markers. Each hidden layer contains mapping processing units, commonly called neurons, and each neuron computes a non-linear function (activation) of the weighted sum of the nodes in the previous layer. Finally, the output layer provides the outcomes of the MLNN. Our proposed MLNN architecture comprises two fully connected hidden layers, schematically represented in Fig. 1B. The input layer in the MLNN considered the SNP markers that explained more than 0.125% of the genetic variance for the FE-related traits (Fig. 1A; ~15k for ADG and DMI, and ~16k for FE and RFI). The input covariate \(X=\{x_{p}\}\) contains the pre-selected SNP markers (p), with dimension N × p (N individuals and p markers). The pre-selected SNP markers are combined with each neuron k (with k = 1, …, Nr) through the weight vector \((W)\) in the hidden layer and then summed with a neuron-specific bias \((b_{k})\), giving the linear score for neuron k: \(Z_{i}^{[1]}=b_{k}^{[1]}+XW^{[1]}\) (Fig. 1B). Subsequently, this linear score is transformed using an activation function \(f(\cdot)\) to map the k neuron-specific scores and produce the first hidden layer output \((V_{1,i}=f(z_{1,i}))\). In the second hidden layer, each neuron k receives a net input coming from hidden layer 1, \(Z_{i}^{[2]}=b_{k}^{[2]}+V_{1,i}W^{[2]}\), where \(W^{[2]}\) represents the weight matrix of dimension k × k (k = number of neurons) connecting the first hidden layer output to the second hidden layer, and \(b_{k}^{[2]}\) is the bias term in hidden layer 2. Then, the activation function is applied to map the kth hidden neuron unit in the second hidden layer and generate its output \(V_{2,i}=f(z_{2,i})\). In the MLNN, a hyperbolic tangent activation function \((\tanh(x)=(e^{x}-e^{-x})/(e^{x}+e^{-x}))\) was adopted in the first and second layers, providing greater flexibility in the MLNN [39].

The prediction of the adjusted FE-related trait was obtained as follows [38]:

$$\mathbf{y}^{*}=f\left(\mathbf{b}+\mathbf{V}_{2,i}\mathbf{W}_{0}\right)+\mathbf{e}$$

where \(\mathbf{y}^{*}\) represents the target adjusted feed efficiency-related trait for the ith animal; \(k\) is the number of neurons considered in the model, assumed the same in the first and second layers; \(\mathbf{W}_{0}\) represents the weights from the k neurons in layer 2; and \(\mathbf{b}\) is the bias parameter. The optimal weights used in the MLNN were obtained by minimizing the mean squared error of prediction in the training subset [40].
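A minimal sketch of that forward pass (all weight and bias objects are hypothetical stand-ins for values learned by minimizing the training mean squared error):

    # Two fully connected hidden layers with tanh activations, as in Fig. 1B.
    # X_sel: N x p pre-selected SNPs; W1: p x k; W2: k x k; W0: k x 1.
    mlnn_predict <- function(X_sel, W1, b1, W2, b2, W0, b0) {
      Z1 <- tanh(sweep(X_sel %*% W1, 2, b1, "+"))  # first hidden layer output
      Z2 <- tanh(sweep(Z1 %*% W2, 2, b2, "+"))     # second hidden layer output
      as.vector(tanh(Z2 %*% W0 + b0))              # output layer, f applied as in the equation above
    }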

The MLNN model was implemented using the R package h2o (https://github.com/h2oai/h2o-3), with a random grid search via the h2o.grid function (https://cran.r-project.org/web/packages/h2o) to determine the number of neurons that maximizes prediction accuracy. We split the training population into five folds to assess the best neural network architecture and then applied it to the disjoint validation set [41,42]. We considered a total of 1000 epochs [36], numbers of neurons ranging from 50 to 2500 in intervals of 100, a dropout ratio of 0.2, and L1 and L2 regularization parameters of 0.0015 and 0.0005, respectively. In this framework, the MLNN was performed using two hidden layers, with the number of neurons (k) obtained during the training process being 750 for ADG, 1035 for DMI, 710 for FE, and 935 for RFI.
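A minimal sketch of that tuning run, assuming h2o's deep learning estimator with a RandomDiscrete search (train_df, the column name y_star, grid_id, and max_models are hypothetical; the paper's exact call is not shown in the text):

    library(h2o)
    h2o.init()

    # train_df: hypothetical data frame with the pre-selected SNP columns plus
    # the adjusted phenotype "y_star" for the training animals.
    train_h2o  <- as.h2o(train_df)
    predictors <- setdiff(colnames(train_df), "y_star")

    grid <- h2o.grid(
      algorithm             = "deeplearning",
      grid_id               = "mlnn_random_search",
      x                     = predictors,
      y                     = "y_star",
      training_frame        = train_h2o,
      nfolds                = 5,                 # mirrors the five-fold tuning split
      hyper_params          = list(
        # two equal hidden layers, 50 to 2500 neurons in steps of 100
        hidden = lapply(seq(50, 2500, by = 100), function(k) c(k, k))
      ),
      search_criteria       = list(strategy = "RandomDiscrete", max_models = 50),
      activation            = "TanhWithDropout", # tanh layers, as in the text
      hidden_dropout_ratios = c(0.2, 0.2),       # dropout ratio of 0.2 per layer
      l1                    = 0.0015,
      l2                    = 0.0005,
      epochs                = 1000
    )

    # Retrieve the best architecture by cross-validated RMSE.
    sorted <- h2o.getGrid("mlnn_random_search", sort_by = "rmse", decreasing = FALSE)
    best   <- h2o.getModel(sorted@model_ids[[1]])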

Support vector regression (SVR) is a kernel-based supervised learning technique used for regression analysis [43]. In the context of GS, SVR uses linear models to implement nonlinear regression by mapping the predictor variables (i.e., SNP markers) into a feature space using different kernel functions (linear, polynomial, or radial basis function) to predict the target information, e.g., the adjusted phenotype [44]. SVR can map linear or nonlinear relationships between phenotypes and SNP markers, depending on the kernel function. The best kernel function mapping genotype to phenotype (linear, polynomial, or radial basis) was determined using the training subset split into five folds. The radial basis function (RBF) was chosen because it outperformed the linear and polynomial (degree = 2) kernels in the training process, increasing predictive ability by 8.25% and showing the lowest MSE.

The general model for SVR using an RBF kernel can be described as [38,45]: \(y_{i}^{*}=b+h(m)^{T}w+e\), where \(h(m)^{T}\) represents the kernel radial basis function used to transform the original predictor variables, i.e., the SNP marker information \((m)\); \(b\) denotes the model bias; and \(w\) represents the unknown regression weight vector. In the SVR, the learned function \(h(m)^{T}\) was obtained by minimizing the loss function. The SVR was fitted using epsilon-support vector regression, which ignores residuals with absolute value \((|y_{i}^{*}-\widehat{y}_{i}^{*}|)\) smaller than some constant \(\epsilon\) and penalizes larger residuals [46].

The RBF kernel considered in the SVR follows the form \(h(m)^{T}=\exp\left(-\gamma\Vert m_{i}-m_{j}\Vert^{2}\right)\), where \(\gamma\) is a parameter quantifying the shape of the kernel function, and \(m_{i}\) and \(m_{j}\) are the vectors of predictor variables for labels i and j. The main parameters in SVR are the cost parameter \((C)\), the gamma parameter \((\gamma)\), and epsilon \((\epsilon)\). The parameters \(C\) and \(\epsilon\) were defined using the training data set information, as proposed by Cherkassky and Ma [47]: \(C=\max\left(\left|\bar{y}^{*}+3\sigma_{y^{*}}\right|,\left|\bar{y}^{*}-3\sigma_{y^{*}}\right|\right)\) and \(\epsilon=3\sigma_{y^{*}}\left(\sqrt{\ln(n)/n}\right)\), in which \(\bar{y}^{*}\) and \(\sigma_{y^{*}}\) are the mean and standard deviation of the adjusted FE-related traits in the training population, and n represents the number of animals in the training set. The gamma parameter \((\gamma)\) was determined through a random search over values ranging from 0 to 5, using the training set split into five folds. The best-trained SVR model used a \(\gamma\) of 2.097 for ADG, 0.3847 for DMI, 0.225 for FE, and 1.075 for RFI. The SVR was implemented using the e1071 R package [48].
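A minimal sketch of that fit with e1071 (X_train, y_train, and X_valid are hypothetical objects; the gamma value shown is the one reported above for ADG):

    library(e1071)

    # Heuristic cost and epsilon from the training data (Cherkassky & Ma).
    n       <- length(y_train)
    C_par   <- max(abs(mean(y_train) + 3 * sd(y_train)),
                   abs(mean(y_train) - 3 * sd(y_train)))
    eps_par <- 3 * sd(y_train) * sqrt(log(n) / n)

    fit <- svm(x = X_train, y = y_train,
               type    = "eps-regression",   # epsilon-SVR
               kernel  = "radial",           # RBF kernel
               cost    = C_par,
               epsilon = eps_par,
               gamma   = 2.097)              # tuned value reported for ADG

    y_pred <- predict(fit, X_valid)          # predictions for the validation set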

Prediction accuracy (acc) of the different statistical approaches was assessed by the Pearson correlation between the adjusted phenotypes \((y^{*})\) and their predicted values \((\widehat{y}_{i}^{*})\) in the validation set, and by the root mean squared error (RMSE). Prediction bias was assessed using the slope of the linear regression of \(\widehat{y}_{i}^{*}\) on \(y^{*}\) for each model. The Hotelling-Williams test [49] was used to assess the significance of the difference in predictive ability of the Bayesian methods (BayesA, BayesB, BayesC, BL, and BRR), MTGBLUP, and machine learning (MLNN and SVR) against STGBLUP. The similarity between the predictive performance of the different models was assessed using Ward's hierarchical clustering method with Euclidean distance. The relative difference (RD) in predictive ability was measured as \(RD=\frac{r_{m}-r_{\text{STGBLUP}}}{r_{\text{STGBLUP}}}\times 100\), where \(r_{m}\) represents the acc of each alternative approach (SVR, MLNN, MTGBLUP, or the Bayesian regression models BayesA, BayesB, BayesC, BL, and BRR), and \(r_{\text{STGBLUP}}\) is the predictive ability obtained using the STGBLUP method.
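A minimal sketch of those metrics in base R (y_star and y_hat are hypothetical vectors for the validation animals; acc_stgblup is the STGBLUP accuracy):

    acc   <- cor(y_hat, y_star)                       # Pearson correlation (acc)
    rmse  <- sqrt(mean((y_star - y_hat)^2))           # root mean squared error
    slope <- coef(lm(y_hat ~ y_star))[2]              # regression slope; 1 = unbiased
    rd    <- (acc - acc_stgblup) / acc_stgblup * 100  # relative difference (%)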

The animal procedures and data sampling presented in this study were approved and performed following the Animal Care and Ethical Committee recommendations of the São Paulo State University (UNESP), School of Agricultural and Veterinary Science (protocol number 18.340/16).

Read this article:
Benchmarking machine learning and parametric methods for genomic prediction of feed efficiency-related traits in ... - Nature.com