Data - CO/AI


Oct 2, 2024

Training an AI model? Mostly AI will take your real data and privacy-proof it

AI privacy innovation: Mostly AI has launched a synthetic text functionality that generates privacy-protected data for training enterprise AI models, addressing concerns about using real customer data containing personally identifiable information. The new tool automates the process of creating synthetic data while preserving the patterns of original datasets, allowing businesses to leverage customer insights without risking privacy. Synthetic data can also be used to rebalance datasets, remove bias, and generate mock data for software testing. How the technology works: Mostly AI's platform allows companies to upload proprietary datasets and fine-tune generators to create privacy-protected, synthesized versions of their data. Users...

Oct 1, 2024

AI data access breakthrough from Nvidia and VAST sparks curiosity

AI data access breakthrough: VAST Data, in collaboration with Nvidia, has introduced a novel approach for businesses to efficiently and securely utilize AI models while leveraging internal corporate information. The new product, VAST InsightEngine, aims to streamline the adoption of large generative AI models for businesses, particularly in creating chatbots that can access and interpret vast amounts of company data. This innovation addresses a significant challenge in one of the most common business use cases for AI: enabling employees to find and make sense of extensive corporate data repositories. Key features and benefits: VAST InsightEngine offers a seamless solution for...

Oct 1, 2024

Startup aims to pay YouTubers for AI training data

A new frontier for content creators: Calliope Networks, an AI-focused content licensing startup, is pioneering a program called "License to Scrape" that aims to change how YouTube creators monetize their content when it is used for AI training. The program seeks to bridge the gap between YouTube creators and AI companies, allowing content creators to be compensated when their videos are used to train artificial intelligence systems. Unlike platforms such as Reddit, YouTube has not yet established formal agreements with AI companies for content scraping, creating an opportunity for third-party solutions. Calliope Networks' CEO, Dave Davis, envisions the program as a...

Oct 1, 2024

Are community-trained AI models the future of LLM development?

The open source AI revolution: Nous Research, a pioneering organization in open source AI, is spearheading efforts to democratize AI model training and development through innovative projects like DisTrO. Nous Research, led by Bowen Peng and Jeffrey Quesnelle, is focused on accelerating open source AI research and empowering independent builders in the AI community. The organization's latest project, DisTrO, demonstrates the feasibility of training AI models across the public internet at unprecedented speeds. Nous Research is also behind other successful open source AI initiatives, including the Hermes family of "neutral" and guardrail-free language models. The DisTrO project: Addressing potential setbacks...

Oct 1, 2024

Microsoft unveils framework for data-enhanced AI apps

A new framework for categorizing RAG tasks: Microsoft researchers have proposed a four-level framework for categorizing retrieval-augmented generation (RAG) tasks for large language models (LLMs), based on the complexity of external data retrieval and reasoning required. The framework aims to help enterprises make informed decisions about integrating external knowledge into LLMs and understanding when more complex systems may be necessary. The categorization ranges from simple explicit fact retrieval to complex hidden rationale queries requiring domain-specific reasoning. This approach recognizes the varying levels of sophistication needed for different types of user queries and LLM applications. Breaking down the four-level categorization: The...

Sep 30, 2024

AI-powered notebook rival built in 24 hours challenges Google

Open-source AI challenges Google's NotebookLM: A data scientist in Singapore has created an open-source alternative to Google's NotebookLM, highlighting the growing capabilities of individual developers in the AI space. Rapid development and key features: Gabriel Chua, a data scientist at Singapore's GovTech agency, built "Open NotebookLM" in just one afternoon using publicly available AI models. The tool transforms PDF documents into personalized podcasts, mirroring a key feature of Google's NotebookLM. It utilizes Meta's Llama 3.1 405B language model and MeloTTS for voice synthesis. A user-friendly interface built with Gradio and hosted on Hugging Face Spaces makes the tool accessible to...

Sep 30, 2024

NetApp unveils ambitious AI-driven data management strategy

NetApp's AI-driven data management vision: NetApp has unveiled a new strategy to address the data challenges faced by enterprises deploying AI, combining its existing ONTAP-based products with a disaggregated architecture and enhanced data manipulation capabilities. The company introduced its "Data First" AI vision at the recent NetApp Insight customer event, focusing on improving the efficiency of AI workloads through better data management. NetApp's approach aims to solve two key challenges: scalability issues in traditional legacy systems and the inefficiencies associated with moving data between storage and external tools in the AI lifecycle. Key components of NetApp's new strategy: Disaggregated Storage...

Sep 27, 2024

How machine learning is helping to predict the next epidemic

The evolving landscape of epidemic forecasting: The COVID-19 pandemic has underscored the critical importance of accurate and timely epidemic forecasting for decision-makers across various sectors, prompting significant advancements in data-driven computational approaches. The emergence of new data sources, including symptomatic online surveys, retail and commerce data, mobility information, and genomics data, has expanded the scope and potential of epidemic forecasting. These novel data streams enable more sophisticated and nuanced predictions, allowing for a more comprehensive understanding of disease spread and potential interventions. Key methodological advances: Machine learning and data-centric approaches are at the forefront of recent developments in epidemic forecasting,...

Sep 27, 2024

How IT leaders are approaching AI for data management

AI in data management: A measured approach: IT leaders are carefully evaluating the role of artificial intelligence, particularly machine learning and generative AI, in enhancing data management practices within their organizations. The focus is on leveraging digital data to improve customer experiences and operational efficiency, with a keen eye on demonstrating clear business value. IT leaders are selectively implementing AI technologies, prioritizing use cases that offer tangible benefits and align with their specific business needs. The adoption of generative AI remains cautious, with many organizations still in the testing phase or focusing on internal applications rather than customer-facing implementations. Retail...

Sep 27, 2024

For successful AI implementation follow these 3 data strategies

The imperative of robust data strategies for AI adoption: Business leaders are emphasizing the critical importance of establishing strong data foundations as a prerequisite for successful AI implementation, highlighting three key approaches to achieve this goal. The growing interest in AI technologies has underscored the need for organizations to prioritize their data strategies before diving into AI adoption. Leaders from prominent organizations such as L&G, DWF, and the North Sea Transition Authority have shared insights on effective methods for building robust data foundations. People-centric approach to data strategy: Placing employees at the forefront of data initiatives is crucial for creating...

Sep 27, 2024

HBR: Why data collectives are the next frontier of labor relations

The AI revolution and labor dynamics: The rapid advancement of artificial intelligence, particularly generative AI, is creating new tensions between companies and employees across various industries. Executives are enthusiastic about AI's potential to transform businesses and increase productivity, while white-collar workers are apprehensive about its impact on their job security and future prospects. The Writers Guild of America strike has already highlighted conflicts over AI usage in the entertainment industry, foreshadowing potential disputes in other sectors. As AI becomes more integrated into business operations, the value of high-quality data for training AI systems is increasing, making employee-generated data an increasingly...

Sep 26, 2024

Airtable’s new AI platform transforms workplace productivity

Airtable's AI-powered enterprise platform: Airtable has launched new capabilities that transform its collaborative app-building platform into an enterprise-grade AI solution, aiming to help companies deploy AI into critical business workflows at scale. The San Francisco-based company introduced App Library, which allows organizations to create standardized AI-powered applications that can be customized across different departments. HyperDB, another new feature, enables integration of massive datasets of over 100 million records from systems like Snowflake and Salesforce. Airtable's platform is already used by major media, retail, and financial services companies to power critical operations. Addressing the AI deployment challenge: Airtable's move comes as...

Sep 24, 2024

How Wolters Kluwer is Bringing AI to Electronic Health Records

Revolutionizing clinical decision support: Wolters Kluwer Health is integrating its popular UpToDate tool into electronic health records (EHRs) through Wellsheet's AI-powered interface, aiming to streamline doctors' access to critical medical information. Concord Hospital Health System in New Hampshire will be the first to pilot this integration, combining UpToDate's trusted clinical content with Wellsheet's EHR-embedded interface. The integration aims to provide physicians with contextually relevant information directly within their workflow, potentially reducing cognitive burden and improving efficiency. UpToDate, with over 3 million users worldwide, is one of the most widely used knowledge resources for healthcare professionals, offering curated and vetted clinical...

Sep 24, 2024

OpenAI Launches Multilingual Dataset to Enhance Global AI Performance

OpenAI's release of a multilingual AI dataset marks a significant advancement in expanding the global reach of artificial intelligence, particularly in languages with limited AI training resources. Bridging the language gap: OpenAI has unveiled the Multilingual Massive Multitask Language Understanding (MMMLU) dataset, evaluating AI performance across 14 diverse languages. The dataset includes Arabic, German, Swahili, Bengali, and Yoruba, addressing criticisms of the AI industry's focus on primarily English-based models. MMMLU builds upon the existing Massive Multitask Language Understanding (MMLU) benchmark, which tested AI knowledge across 57 disciplines but only in English. The new dataset has been made available on the...

Sep 23, 2024

LinkedIn AI Backlash Highlights Need for EU-Like Privacy Protections

LinkedIn's AI training sparks privacy concerns: LinkedIn's decision to use user data for training its AI tools has ignited a debate about data privacy and user consent in the tech industry. The professional networking platform has begun using member data to improve its AI capabilities, a move that has drawn criticism from users concerned about privacy and transparency. This decision follows similar actions by other tech giants like Meta (Facebook, Instagram) and X (Twitter), who have also leveraged user data for AI development. Notably, LinkedIn has excluded users in the European Union, European Economic Area, and Switzerland from this data...

Sep 23, 2024

How Altimate AI Is Using Agentic AI to Transform Enterprise Data Management

AI-powered data operations revolution: Altimate AI's new DataMates technology aims to transform enterprise data management by leveraging agentic AI to automate and accelerate a wide range of tasks. The San Francisco-based startup has introduced DataMates as part of its DataPilot platform, designed to address the growing challenges faced by overworked and understaffed enterprise data teams. DataMates acts as virtual teammates, using AI to handle time-consuming and repetitive tasks such as data documentation, testing, and performance optimization. The technology aims to reduce the burden on data professionals, allowing them to focus on higher-level tasks and meet business requirements more efficiently. The...

Sep 23, 2024

Cloudflare Launches AI Bot Detection Tools to Empower Website Owners

Cloudflare's new AI bot detection and blocking tools mark a significant shift in the landscape of web content protection and AI data acquisition. This development empowers website owners with unprecedented control over AI bot access to their content, potentially reshaping the dynamics between content creators and AI companies. Revolutionary web protection: Cloudflare has introduced free AI auditing tools called Bot Management, available to all customers, including its 33 million free users. The tools enable real-time monitoring of AI crawlers visiting websites and scraping data, providing website owners with valuable insights into AI bot activity. Customers can selectively block AI bots...

Sep 21, 2024

LinkedIn Halts UK Data Use for AI Training Amid Privacy Concerns

AI training suspended for UK LinkedIn users: LinkedIn has halted the use of UK user data for training its artificial intelligence models following concerns raised by the Information Commissioner's Office (ICO). The Microsoft-owned professional networking platform had previously included users worldwide in its AI training data collection without explicit consent. Stephen Almond, executive director of the ICO, expressed satisfaction with LinkedIn's decision to pause the use of UK users' information for AI training purposes. LinkedIn stated its willingness to engage further with the ICO on this matter. The broader context of AI data collection: Many tech giants, including LinkedIn, are...

Sep 20, 2024

LinkedIn is Training its AI Models on Your Data — Here’s How to Opt Out

LinkedIn's AI training initiative: LinkedIn has implemented a new policy that allows the company to use user data for training generative AI models, with users automatically opted in without explicit consent. The professional networking platform introduced a new privacy setting and opt-out form before updating its privacy policy to reflect this change. LinkedIn states that it uses generative AI for features such as writing assistance, but the extent of data usage and potential applications remains unclear. This move follows a recent admission by Meta that it has been scraping non-private user data for AI model training since 2007. Opting out...

Sep 20, 2024

What IT Leaders Should Know About Google’s NotebookLM Upgrade

AI-powered research tool evolves: Google's NotebookLM, initially launched in July 2023, has expanded its capabilities and found increasing applications in enterprise settings. NotebookLM allows users to upload various file types, including PDFs, websites, Google Docs, and Google Slides, into a single notebook for easy information management. The tool utilizes Google's Gemini AI to answer questions about the documents within the notebook, enhancing research and analysis capabilities. Since its general availability in the U.S. in December 2023, NotebookLM has seen growing adoption among corporate teams for sharing research and information. New audio generation feature: Google has introduced a novel capability that...

Sep 19, 2024

How Data, Machine Learning and AI are Transforming Industries

The digital age love triangle: Data, machine learning, and artificial intelligence form a powerful alliance that is reshaping industries and driving innovation across various sectors. This unique relationship has the potential to revolutionize decision-making processes, unlock hidden insights, and tackle complex challenges in ways previously thought impossible. The synergy between these three elements is increasingly becoming the cornerstone of progress in fields such as healthcare, finance, marketing, and transportation. Data as the foundation: Data serves as the critical base upon which machine learning and artificial intelligence build their capabilities, providing the raw material for analysis and decision-making. The growing recognition...

Sep 19, 2024

Alation and Salesforce Unite to Boost Enterprise Data Governance

Data governance partnership unveiled: Alation, a data intelligence platform vendor, has joined forces with Salesforce to enhance data governance across enterprises through integration with Salesforce's Data Cloud. The collaboration introduces bidirectional integration between Alation's platform and Salesforce's Data Cloud, enabling comprehensive data governance and end-to-end lineage tracking. Companies can now access crucial metadata from over 100 data sources within Data Cloud, expanding their data visibility and management capabilities. Alation has developed a dedicated connector to Salesforce Data Cloud and is committed to deepening connectivity as new features emerge. Enhanced AI capabilities: The partnership aims to improve AI functionalities within the...

Sep 18, 2024

Library of Congress Is a Go-To Data Source for Companies Training AI Models

The Library of Congress: A new frontier for AI development: The world's largest library has become an attractive resource for AI companies seeking to train their advanced language models using its vast digital archives. The Library of Congress (LOC) houses 180 million items, including books, manuscripts, maps, and audio recordings, with 185 petabytes of digital data. AI companies are increasingly interested in accessing this data to develop and train their most sophisticated AI models. The library's digital collections offer rare, original, and authoritative information in over 400 languages, spanning various disciplines. Surge in data access requests: The Library of Congress...

Sep 18, 2024

How AI Tools Can Save Healthcare When It Comes To Managing Patient Data

AI's potential in healthcare: Artificial intelligence is emerging as a promising tool to address challenges in the medical field, particularly in managing the vast amounts of patient data that have accumulated over the past two decades. Dr. Pete Clardy, senior clinical specialist at Google Health, highlights the "fragmentation problem" in health data, where complex information is scattered across different locations and formats. AI tools are being developed to sort and summarize this data for doctors, potentially streamlining their workflow and improving patient care. However, the healthcare industry's cautious approach to new technology stems from concerns about patient safety and the...
