  • >> [Narrator] Live from New York, it's The Cube

  • covering the IBM Machine Learning Launch Event

  • brought to you by IBM.

  • Here are your hosts, Dave Vellante and Stu Miniman.

  • >> Good morning everybody, welcome to the Waldorf Astoria.

  • Stu Miniman and I are here in New York City,

  • the Big Apple,

  • for IBM's Machine Learning Event #IBMML.

  • We're fresh off Spark Summit, Stu,

  • where we had The Cube, this by the way is The Cube,

  • the worldwide leader in live tech coverage.

  • We were at Spark Summit last week,

  • George Gilbert and I,

  • watching the evolution of so-called big data.

  • Let me frame, Stu, where we're at

  • and bring you into the conversation.

  • The early days of big data were all about

  • offloading the data warehouse and reducing the cost

  • of the data warehouse.

  • I often joke that the ROI of big data

  • is reduction on investment, right?

  • There's these big, expensive data warehouses.

  • It was quite successful in that regard.

  • What then happened is we started to throw

  • all this data into the data warehouse.

  • People would joke it became a data swamp,

  • and you had a lot of tooling

  • to try to clean the data warehouse

  • and a lot of transforming and loading

  • and the ETL vendors started to participate there

  • in a bigger way.

  • Then you saw the extension of these data pipelines

  • to try to do more with that data.

  • The Cloud guys have now entered in a big way.

  • We're now entering the Cognitive Era,

  • as IBM likes to refer to it.

  • Others talk about AI and machine learning

  • and deep learning,

  • and that's really the big topic here today.

  • What we can tell you is that the news goes out

  • at 9:00am this morning, and it was well known

  • that IBM's bringing machine learning

  • to its mainframe, z mainframe.

  • Two years ago, Stu, IBM announced the z13,

  • which was really designed to bring

  • analytic and transaction processing together

  • on a single platform.

  • Clearly IBM is extending the useful life

  • of the mainframe by bringing things like Spark,

  • certainly what it did with Linux

  • and now machine learning into z.

  • I want to talk about Cloud, the importance of Cloud,

  • and how that has really taken over the world of big data.

  • Virtually every customer you talk to now

  • is doing work on the Cloud.

  • It's interesting to see now

  • IBM unlocking its transaction base,

  • its mission-critical data,

  • to this machine learning world.

  • What are you seeing around Cloud and big data?

  • >> We've been digging into this big data space

  • since before it was called big data.

  • One of the early things that really got me

  • interested and excited about it is,

  • from the infrastructure standpoint,

  • storage has always been one of its costs

  • that we had to have,

  • and the massive amounts of data,

  • the digital explosion we talked about,

  • is keeping all that information

  • or managing all that information

  • was a huge challenge.

  • Big data was really that bit flip.

  • How do we take all that information

  • and make it an opportunity?

  • How do we get new revenue streams?

  • Dave, IBM has been at the center of this

  • and looking at the higher-level pieces

  • of not just storing data, but leveraging it.

  • Obviously huge in analytics, lots of focus

  • on everything from Hadoop and Spark and newer technologies,

  • but digging in to how they can leverage up the stack,

  • which is where IBM has done a lot of acquisitions

  • in that space and leveraging that

  • and wants to make sure that they have a strong position

  • both in Cloud, which was renamed.

  • SoftLayer is now IBM Bluemix

  • with a lot of services

  • including a machine learning service

  • that leverages the Watson technology

  • and of course on-prem they've got the z

  • and the Power solutions

  • that you and I have covered for many years

  • at the IBM Edge show.

  • >> Machine learning obviously heavily leverages models.

  • We've seen in the early days of the data,

  • the data scientists would build models

  • and machine learning allows those models

  • to be perfected over time.

  • So there's this continuous process.

  • We're familiar with the world of batch

  • and then the minicomputer brought in

  • the world of interactive,

  • so we're familiar with those types of workloads.

  • Now we're talking about a new emergent workload

  • which is continuous.

  • Continuous apps where you're streaming data in,

  • what Spark is all about.

  • The models that data scientists are building

  • can constantly be improved.

  • The key is automation, right?

  • Being able to automate that whole process,

  • and being able to collaborate

  • between the data scientist, the data quality engineers,

  • even the application developers,

  • that's something that IBM really tried to address

  • in its last big announcement in this area

  • which was in October of last year,

  • the Watson Data Platform,

  • what they called at the time DataWorks.

  • So really trying to bring together

  • those different personas

  • in a way that they can collaborate together

  • and improve models on a continuous basis.

  • The use cases that you often hear in big data

  • and certainly initially in machine learning

  • are things like fraud detection.

  • Obviously ad serving has been a big data application

  • for quite some time.

  • In financial services, identifying good targets,

  • identifying risk.

  • What I'm seeing, Stu, is that the phase that we're in now

  • of this so-called big data and analytics world,

  • and now bringing in machine learning and deep learning,

  • is to really improve on some of those use cases.

  • For example, fraud's gotten much, much better.

  • Ten years ago, let's say, it took many, many months,

  • if you ever detected fraud.

  • Now you get it in seconds, or sometimes minutes,

  • but you also get a lot of false positives.

  • Oops, sorry, the transaction didn't go through.

  • Did you do this transaction?

  • Yes, I did.

  • Oh, sorry, you're going to have to redo it

  • because it didn't go through.

  • It's very frustrating for a lot of users.

  • That will get better and better and better.

  • We've all experienced retargeting from ads,

  • and we know how crappy they are.

  • That will continue to get better.

  • The big question that people have

  • and it goes back to Jeff Hammerbacher,

  • the best minds of my generation

  • are trying to get people to click on ads.

  • When will we see big data really start

  • to affect our lives in different ways

  • like patient outcomes?

  • We're going to hear some of that today

  • from folks in health care and pharma.

  • Again, these are the things that people are waiting for.

  • The other piece is, of course, IoT.

  • What are you seeing, in terms of IoT,

  • in the whole data flow?

  • >> Yes, a big question we have, Dave, is

  • where's the data?

  • And therefore, where does it make sense

  • to be able to do that processing?

  • In big data we talked about you've got

  • massive amounts of data,

  • can we move the processing to that data?

  • With IoT, the day before, our CTO talked about how

  • there's going to be massive amounts of data at the edge

  • and I don't have the time or the bandwidth

  • or the need necessarily