What is Apache Spark? The big data platform that crushed Hadoop
Computerworld | Big Data
by Ian Pointer
4y ago
Apache Spark defined: Apache Spark is a data processing framework that can quickly perform processing tasks on very large data sets, and can also distribute data processing tasks across multiple computers, either on its own or in tandem with other distributed computing tools. These two qualities are key to the worlds of big data and machine learning, which require the marshalling of massive computing power to crunch through large data stores. Spark also takes some of the programming burdens of these tasks off the shoulders of developers with an easy-to-use API that abstracts away much of the gr ..read more
What is natural language processing? AI for speech and text
Computerworld | Big Data
by Martin Heller
4y ago
From a friend on Facebook: Me: “Alexa please remind me my morning yoga sculpt class is at 5:30am.” Alexa: “I have added Tequila to your shopping list.” We talk to our devices, and sometimes they recognize what we are saying correctly. We use free services to translate foreign language phrases encountered online into English, and sometimes they give us an accurate translation. Although natural language processing has been improving by leaps and bounds, it still has considerable room for improvement.
What is deep learning? Algorithms that mimic the human brain
Computerworld | Big Data
by Martin Heller
4y ago
Deep learning defined: Deep learning is a form of machine learning that models patterns in data as complex, multi-layered networks. Because deep learning is the most general way to model a problem, it has the potential to solve difficult problems, such as computer vision and natural language processing, that outstrip both conventional programming and other machine learning techniques. Deep learning not only can produce useful results where other methods fail, but also can build more accurate models than other methods, and can reduce the time needed to build a useful model. However, training deep ..read more
What is big data analytics? Fast answers from diverse data sets
Computerworld | Big Data
by Bob Violino
4y ago
There’s data, and then there’s big data. So, what’s the difference? Big data defined: A clear big data definition can be difficult to pin down because big data can cover a multitude of use cases. But in general the term refers to sets of data that are so large in volume and so complex that traditional data processing software products are not capable of capturing, managing, and processing the data within a reasonable amount of time. These big data sets can include structured, unstructured, and semistructured data, each of which can be mined for insights. How much data actually constitutes “big ..read more
Semi-supervised learning explained
Computerworld | Big Data
by Martin Heller
5y ago
In his 2017 Amazon shareholder letter, Jeff Bezos wrote something interesting about Alexa, Amazon’s voice-driven intelligent assistant: “In the U.S., U.K., and Germany, we’ve improved Alexa’s spoken language understanding by more than 25% over the last 12 months through enhancements in Alexa’s machine learning components and the use of semi-supervised learning techniques. (These semi-supervised learning techniques reduced the amount of labeled data needed to achieve the same accuracy improvement by 40 times!)” Given those results, it might be interesting to try semi-supervised learning on our ..read more
Automated machine learning or AutoML explained
Computerworld | Big Data
by Martin Heller
5y ago
The two biggest barriers to the use of machine learning (both classical machine learning and deep learning) are skills and computing resources. You can solve the second problem by throwing money at it, either for the purchase of accelerated hardware (such as computers with high-end GPUs) or for the rental of compute resources in the cloud (such as instances with attached GPUs, TPUs, and FPGAs).
HPE plus MapR: Too much Hadoop, not enough cloud
Computerworld | Big Data
by Matt Asay
5y ago
Cloud killed the fortunes of the Hadoop trinity (Cloudera, Hortonworks, and MapR), and that same cloud likely won’t rain success down on HPE, which recently acquired the business assets of MapR. While the deal promises to marry “MapR’s technology, intellectual property, and domain expertise in artificial intelligence and machine learning (AI/ML) and analytics data management” with HPE’s “Intelligent Data Platform capabilities,” the deal is devoid of the one ingredient that both companies need most: cloud.
Insider Pro guide to big data certifications
Computerworld | Big Data
by Neal Weinberg
5y ago
Whether you’re a veteran IT manager looking to jumpstart your career, a young IT pro who wants to advance to the next great opportunity or an executive on the business side who wants to help the company optimize its business intelligence (BI) capabilities, all roads lead to big data. By every measure, big data is the fastest growing area within IT. The big data job category has the most job openings, the top salary and the highest job satisfaction ratings. Specific job titles within the broad big data category include data scientist, data architect and data engineer.
Julia vs. Python: Which is best for data science?
Computerworld | Big Data
by Serdar Yegulalp
5y ago
Among the many use cases Python covers, data analytics has become perhaps the biggest and most significant. The Python ecosystem is loaded with libraries, tools, and applications that make the work of scientific computing and data analysis fast and convenient. But for the developers behind the Julia language (aimed specifically at “scientific computing, machine learning, data mining, large-scale linear algebra, distributed and parallel computing”), Python isn’t fast or convenient enough. Python represents a trade-off, good for some parts of data analytics work but terrible for others.
Supervised learning explained
Computerworld | Big Data
by Martin Heller
5y ago
Machine learning is a branch of artificial intelligence that includes algorithms for automatically creating models from data. At a high level, there are four kinds of machine learning: supervised learning, unsupervised learning, reinforcement learning, and active machine learning. Since reinforcement learning and active machine learning are relatively new, they are sometimes omitted from lists of this kind. You could also add semi-supervised learning to the list, and not be wrong.