Fraud is top of mind for all banks. Steve Rosenbush at the Wall Street
Journal recently wrote about Visa’s new Big Data analytic engine, which has
changed the way the company combats fraud. Visa estimates that its new Big
Data fraud platform has identified $2 billion in potential annual incremental
fraud savings. With Big Data, its new analytic engine can study as many as
500 aspects of a transaction at once. That’s a sharp improvement over the
company’s previous analytic engine, which could study only 40 aspects at
once. And instead of using just one analytic model, Visa now operates 16
models, covering different segments of its market, such as geographic
regions.
Do you think Visa, or any bank for that matter, uses just batch analytics to
provide fraud detection? Hadoop can play a significant role in building
models. However, only a real-time solution... (more)
Have you heard of products like IBM’s InfoSphere Streams, Tibco’s Event
Processing product, or Oracle’s CEP product? These are all good examples of
commercially available stream processing technologies that help you process
events in real time.
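To make "processing events in real time" concrete, here is a minimal sketch in plain Python. It is not how any of the products above work internally. It only illustrates the general idea of handling each event as it arrives instead of waiting for a batch. The rule itself (more than 3 events on one card within 60 seconds), the names `flag_bursts`, `WINDOW_SECONDS` and `MAX_EVENTS`, and the sample data are all hypothetical.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60  # hypothetical sliding-window length
MAX_EVENTS = 3       # hypothetical limit per card inside one window


def flag_bursts(events, window=WINDOW_SECONDS, limit=MAX_EVENTS):
    """Yield (card_id, timestamp) each time a card exceeds `limit` events
    within the last `window` seconds.

    `events` is an iterable of (timestamp_seconds, card_id) pairs,
    assumed to arrive in time order.
    """
    recent = defaultdict(deque)  # card_id -> timestamps still in the window
    for ts, card in events:
        timestamps = recent[card]
        timestamps.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while timestamps and ts - timestamps[0] > window:
            timestamps.popleft()
        if len(timestamps) > limit:
            yield card, ts


# Made-up sample stream: card "A" fires four times within 30 seconds.
sample = [(0, "A"), (10, "A"), (20, "B"), (25, "A"), (30, "A"), (200, "A")]
alerts = list(flag_bursts(sample))
```

Because `flag_bursts` is a generator, an alert is produced as soon as the offending event is seen, which is the key difference from a batch job that scans the data after the fact.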
I’ve been asked what I consider to be “Big Data” versus “Small Data”
in this domain. Here’s my view.
Real-Time Analytics: Small Data versus Big Data
Data Volume
  Small Data: None
  Big Data: None
Data Velocity
  Small Data: 100K events / day (<<1K events / second)
  Big Data: Billion+ events / day (>>1K events / second)
Data Variety
  1-6 unstructured on sources AND 1 single destination (an output file, a
  SQ... (more)
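The velocity figures above are easier to compare once they are converted from events per day to events per second. The sketch below does that arithmetic. Treating 1K events / second as a hard cutoff is my simplifying assumption (the comparison only says "much less than" and "much more than" 1K), and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def events_per_second(events_per_day):
    """Average event rate, assuming events are spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY


def velocity_class(events_per_day, threshold_per_second=1_000):
    """Label a workload 'big' or 'small' by average velocity.

    The 1,000 events/second cutoff is an assumed simplification.
    """
    rate = events_per_second(events_per_day)
    return "big" if rate > threshold_per_second else "small"


small_rate = events_per_second(100_000)      # roughly 1.16 events / second
big_rate = events_per_second(1_000_000_000)  # roughly 11,574 events / second
```

Averages hide bursts, so a workload with a modest daily total can still spike well past the per-second threshold at peak times.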
Do you think that you’re working with “Big Data”? Or is it “Small
Data”? If you’re asking ad hoc questions of your data, you’ll probably
need something that supports “query-response” performance or, in other
words, “near real-time”. We’re not talking about batch analytics, but
more interactive / iterative analytics. Think NoSQL, or “near real-time
Hadoop” with technologies like Impala. Here’s my view of Big versus Small
with ad hoc analytics in either case.
Ad Hoc Analytics: Small Data versus Big Data
Data Volume
  Small Data: Megabytes – Gigabytes
  Big Data: Terabytes (1-100TB)
Data Velocity
  Update in near rea... (more)
How do you know whether you are dealing with Big Data or Small Data? I’m
constantly asked for my definition of “Big Data”. Well, here it is…for
batch analytics.
Batch Analytics: Small Data versus Big Data
Data Volume
  Small Data: Gigabytes
  Big Data: Terabytes – Petabytes
Data Sources
  Small Data: 1-6 (structured – SQL, or unstructured – NoSQL)
  Big Data: 6+ structured AND 6+ unstructured
Business Functions
  Small Data: One line of business (e.g. sales)
  Big Data: Several lines of business all the way up to a 360 degree view of
  the business
Business Questions
  Queries are complex requiring many concurrent data modifications, a rich
  breadth ... (more)
I was talking to one of the prominent General Partners at a Venture Capital
firm here in Silicon Valley over the holidays…discussing how the Cloud
market is evolving. We both agreed that 2013 will mark yet another shift in
the evolution of web applications.
I personally simplified this view of cloud evolution by defining its
progression in the following periods...dating back to these noteworthy
events:
ISP Era: Software Tool & Die is founded in 1989
ASP to SaaS Era: TeleComputing founder coins ASPs in 1996
IaaS & PaaS Era: Amazon EC2 is launched in 2006
Analytics Application Er... (more)