Sessions Agenda
Each entry below lists the session name; the speaker and company; the track, time, and room (sponsored sessions are marked); and a description.

Real Time Analytics with Druid
Guillaume Torche (GumGum) | Big Data | 10:30 AM | GC-150
GumGum uses Druid to ingest more than 30 billion events every day, which can be queried almost as soon as they happen with a very low response time. This is a tell-all talk about GumGum's love story with Druid, how Druid works, and how GumGum leverages Druid's capabilities.

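For readers unfamiliar with Druid, the sketch below shows what a native timeseries query against a Druid broker looks like over HTTP; the broker address, datasource, and metric names are illustrative assumptions, not details from GumGum's deployment.

```python
# A hedged sketch of a Druid native timeseries query over HTTP.
# Broker address, datasource, and metric names are illustrative.
import json
import requests

query = {
    "queryType": "timeseries",
    "dataSource": "ad_events",                       # hypothetical datasource
    "granularity": "minute",
    "intervals": ["2016-07-09T00:00/2016-07-09T01:00"],
    "aggregations": [
        {"type": "count", "name": "rows"},
        {"type": "longSum", "name": "impressions", "fieldName": "impressions"},
    ],
}

resp = requests.post(
    "http://druid-broker:8082/druid/v2/",
    data=json.dumps(query),
    headers={"Content-Type": "application/json"},
)
for bucket in resp.json():
    print(bucket["timestamp"], bucket["result"])
```
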
Portable Stream and Batch Processing with Apache Beam and Google Cloud Dataflow
Eric Anderson (Google) | Big Data | 11:10 AM | Library 4th floor
This talk explores deploying a series of small and large batch and streaming pipelines locally, to Spark and Flink clusters, and to the Google Cloud Dataflow service to give the audience a feel for the portability of Beam, a new portable big data processing framework recently submitted by Google to the Apache Software Foundation. The talk will also look at how the programming model handles late-arriving data in a stream with event time, windows, and triggers.

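As a rough illustration of the event-time windowing and triggering concepts the talk covers, here is a minimal Beam pipeline using the Python SDK; the input path, record fields, and window/lateness settings are assumptions for the sketch, not taken from the talk.

```python
# A minimal sketch of Beam's event-time windows and triggers (Python SDK).
import json
import apache_beam as beam
from apache_beam import window
from apache_beam.transforms.trigger import AfterWatermark, AfterProcessingTime, AccumulationMode

def to_timestamped(line):
    # Assume each record carries its own event time (epoch seconds).
    event = json.loads(line)
    return window.TimestampedValue((event['user'], 1), event['timestamp'])

with beam.Pipeline() as p:
    (p
     | 'Read' >> beam.io.ReadFromText('gs://my-bucket/events-*.json')    # hypothetical input
     | 'AssignEventTime' >> beam.Map(to_timestamped)
     | 'Window' >> beam.WindowInto(
           window.FixedWindows(60),                               # 1-minute event-time windows
           trigger=AfterWatermark(late=AfterProcessingTime(30)),  # re-fire when late data arrives
           accumulation_mode=AccumulationMode.ACCUMULATING,
           allowed_lateness=600)                                  # accept data up to 10 minutes late
     | 'CountPerUser' >> beam.CombinePerKey(sum)
     | 'Write' >> beam.io.WriteToText('gs://my-bucket/output/counts'))
```
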
Building scalable enterprise data flows and IoT apps using Apache NiFi
Dhruv Kumar (Hortonworks) | Big Data | 11:50 AM | GC-130
Connecting enterprise systems has always been a tough task. Modern IoT applications have exacerbated the issue by requiring legacy systems to be integrated with novel high-velocity data streams. Various patterns like messaging and REST have been proposed, but they necessitate rearchitecting the integration layer, which is extremely arduous. In this talk we will show you how to use Apache NiFi to solve your data integration, movement, and ingestion problems. Next, we will examine how Apache NiFi can be used to construct durable, scalable, and responsive IoT apps in conjunction with other stream processing and messaging frameworks.

Twitter Heron @ Scale
Karthik Ramasamy (Twitter) | Big Data | 1:30 PM | Library 4th floor
Twitter generates billions and billions of events per day. Analyzing these events in real time presents a massive challenge. Twitter designed and deployed a new streaming system called Heron. Heron has been in production for nearly two years and is widely used by several teams for diverse use cases. This talk looks at Twitter's operating experiences, the challenges of running Heron at scale, and the approaches taken to solve those challenges.

Warner Bros. Digital Consumer Intelligence at Scale
Brian Kursar (Warner Bros.) | Big Data | 2:10 PM | GC-160
Warner Bros. processes billions of records each day globally across its web assets, digital content distribution, OTT streaming services, online and mobile games, technical operations, anti-piracy programs, social media, and retail point-of-sale transactions. Combining these datasets with content metadata, Warner Bros. is able to produce consumer insights and affinity models that result in highly accurate audience segments.

Rapid Analytics @ Netflix LA (Updated and Expanded)
Chris Stephens (Netflix, Inc.) | Big Data | 2:50 PM | FA-100
This talk explores how Netflix equips its engineers with the freedom to find and introduce the right software for the job, even if it isn't used anywhere else in-house. Examples include how Netflix has enabled analysts to fluidly switch between an MPP RDBMS and an auto-scaling Presto cluster, how Spark and NoSQL stores are used when deploying data sets to internal web apps, and how data scientists are enabled to work in the ML framework of their choosing and deploy models as a service.

Puree Through Trillions of Clicks in Seconds
Jag Srawan (Interana) | Big Data | 3:50 PM | GC-130 | Sponsored
Interana is a full-stack analytics solution that provides lightning-fast querying capabilities using a proprietary storage format. Interana was designed to utilize the best of both in-memory and on-disk architectures. This talk serves as an introduction to event data concepts and to the advanced behavioral analysis built into Interana. Attendees will learn how to model their data effectively using our full-service solution.

How To Use Impala and Kudu To Optimize Performance for Analytic Workloads
David Alves (Cloudera) | Big Data | 4:30 PM | Library 4th floor
This session describes how Impala integrates with Kudu for analytic SQL queries on Hadoop and how this integration, taking full advantage of the distinct properties of Kudu, has significant performance benefits.

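As a hedged sketch of what the Impala-on-Kudu integration looks like from a client's point of view, the example below issues an analytic SQL query against a Kudu-backed Impala table using the impyla client; the host, table, and column names are illustrative assumptions.

```python
# A hedged sketch of querying a Kudu-backed table through Impala via impyla.
# Host, table, and column names are illustrative.
from impala.dbapi import connect

conn = connect(host='impala-coordinator.example.com', port=21050)
cur = conn.cursor()

# Kudu tables are exposed to Impala as regular SQL tables, so analytic queries
# run unchanged while Kudu handles the columnar storage underneath.
cur.execute("""
    SELECT region, COUNT(*) AS events, AVG(latency_ms) AS avg_latency
    FROM metrics_kudu
    WHERE event_time >= '2016-01-01'
    GROUP BY region
    ORDER BY events DESC
    LIMIT 10
""")
for row in cur.fetchall():
    print(row)
```
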
Apply R in Enterprise Applications
Louis Bajuk-Yorgan (TIBCO Software Inc.) | Big Data | 5:10 PM | GC-150
Prototypes are typically re-implemented in another language due to compatibility issues with R in the enterprise, but TIBCO Enterprise Runtime for R (TERR) allows the language to be run on several platforms. Enterprise-level scalability has been brought to the R language, enabling rapid iteration without the need to recode, re-implement, and test. This presentation will delve further into these topics, highlighting specific use cases and the true value that can be gained from utilizing R. The session will be followed by a lively, open Q&A discussion.

Fluentd and Embulk: Collect More Data, Grow Faster
Kazuki Ohta (Treasure Data) | Big Data | 5:50 PM | GC-130
Since Doug Cutting invented Hadoop and Amazon Web Services released S3 ten years ago, we've seen quite a bit of innovation in large-scale data storage and processing. These innovations have enabled engineers to build data infrastructure at scale, yet many of them fail to fill their scalable systems with useful data, struggling to unify data silos or to collect logs from thousands of servers and millions of containers. Fluentd and Embulk are two projects I've been involved in to solve the unsexy yet critical problem of data collection and transport. In this talk, I will give an overview of Fluentd and Embulk and survey how they are used at companies like Microsoft and Atlassian and in projects like Docker and Kubernetes.

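For context on what Fluentd-based collection looks like from application code, here is a minimal sketch using the fluent-logger Python package; the tag, record fields, and local agent address are assumptions, not details from the talk.

```python
# A minimal sketch of sending structured events to a local Fluentd agent
# with the fluent-logger package; tag and fields are illustrative.
from fluent import sender

logger = sender.FluentSender('myapp', host='localhost', port=24224)

# Each emit() forwards a structured record to Fluentd, which can then route it
# to S3, Treasure Data, Elasticsearch, or any other configured output.
if not logger.emit('purchase', {'user_id': 42, 'amount': 19.99, 'currency': 'USD'}):
    print(logger.last_error)
    logger.clear_last_error()

logger.close()
```
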
Data Storytelling for Impact
Dave Goodsmith (DataScience) | Data Science | 10:30 AM | GC-160
How can our data make the biggest impact? How do we find the stories worth sharing buried in our analytics? How important are visuals, hooks, connections, content? As data science and journalism have co-evolved, the potential for effectively communicating with data has skyrocketed. We'll look at case studies of impactful data stories and share the process for developing data stories that drive action.

Decision Making and Lambda Architecture
Girish Kathalagiri (Samsung SDS Research America) | Data Science | 11:10 AM | GC-130
Online decision making requires interacting with an ever-changing environment, and the underlying machine learning models need to adapt to that changing environment. This talk discusses a class of machine learning algorithms and provides details of how the computation is parallelized using the Spark framework.

Data Science + Hollywood
Conor Dowling (Netflix, Inc.) | Data Science | 11:50 AM | FA-100 | Sponsored
Netflix will spend six billion dollars this year on content, making the company a major player in Hollywood. An increasing portion of this spend will be on original shows such as House of Cards, and original movies such as Beasts of No Nation. As we continue to expand our involvement with Hollywood, we want to leverage data and data science to make the best decisions possible. This talk will explore areas where we see the most opportunity to apply data science to Hollywood, and some early approaches we've taken.

Data Science + Hollywood
Todd Holloway (Netflix, Inc.) | Data Science | 11:50 AM | FA-100 | Sponsored
Netflix will spend six billion dollars this year on content, making the company a major player in Hollywood. An increasing portion of this spend will be on original shows such as House of Cards, and original movies such as Beasts of No Nation. As we continue to expand our involvement with Hollywood, we want to leverage data and data science to make the best decisions possible. This talk will explore areas where we see the most opportunity to apply data science to Hollywood, and some early approaches we've taken.

Enabling Cross-Screen Advertising with Machine Learning and Spark
Debajyoti (Deb) Ray (VideoAmp) | Data Science | 1:30 PM | FA-100
With content now viewed seamlessly across multiple screens, this shift in consumer behavior has collided with the way advertising is sold, separately in TV and online silos, creating an opportunity to make advertising more effective using data and machine learning. This talk explores technological developments at VideoAmp that bring together data from disparate mediums and create cross-screen audience models: ML methods for cross-screen bid optimization, graph-based audience models for 150 million users across over a billion unique device IDs, and behavioral insights gleaned from observing such a large variety of data.

The right tool for the job: Guidelines for algorithm selection in predictive modeling
Derek Wilcox (ZestFinance) | Data Science | 2:10 PM | Library 4th floor
The goal of this talk is to lay out a framework for which algorithms work best in which situations, and why. Drawing on the results of hundreds of crowd-sourced predictive modeling contests, this talk shows examples of how data structure informs the choice of algorithm. As an illustration of these concepts, ZestFinance's work with China's retail giant JD.com is used to describe how the right algorithms were applied to the right datasets to turn shopping data into credit data, creating credit scores from scratch.

Stream processing with R and Amazon Kinesis
Gergely Daroczi (CARD.com) | Data Science | 2:50 PM | GC-160
This talk presents an original R client library for interacting with Amazon Kinesis via a simple daemon that starts multiple R sessions on a machine or cluster of nodes to process data from, in theory, any number of shards. It will also feature some demo micro-applications that stream dummy credit card transactions, enrich this data, and then trigger other data consumers for various business needs, such as scoring transactions, updating real-time dashboards, and messaging customers. Besides the technical details, the motives behind choosing R and Kinesis will also be covered, including a quick overview of the related data infrastructure changes at CARD.

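The talk's client library is written in R; purely for illustration, here is a hedged Python sketch (via boto3 rather than R) of the shard-reading loop a Kinesis consumer performs. The stream name and the processing stub are hypothetical.

```python
# A hedged sketch of the per-shard read loop behind any Kinesis consumer,
# shown with boto3; stream name and processing logic are illustrative.
import time
import boto3

def process(data):
    # Placeholder for enrichment, scoring, dashboard updates, etc.
    print(data)

kinesis = boto3.client('kinesis', region_name='us-east-1')
stream = 'card-transactions'  # hypothetical stream

shard_id = kinesis.describe_stream(StreamName=stream)['StreamDescription']['Shards'][0]['ShardId']
iterator = kinesis.get_shard_iterator(
    StreamName=stream, ShardId=shard_id, ShardIteratorType='LATEST')['ShardIterator']

while True:
    out = kinesis.get_records(ShardIterator=iterator, Limit=100)
    for record in out['Records']:
        process(record['Data'])
    iterator = out['NextShardIterator']
    time.sleep(1)  # stay under the per-shard read limits
```
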
The Evolving Data Science Landscape
Kyle Polich | Data Science | 3:50 PM | FA-100 | Sponsored
The impact of data science on business is undeniable, and the value it provides is growing without signs of slowing. To keep up with this rapidly evolving technology landscape, data scientists must adapt and specialize through continuous learning. This talk focuses on how they can do that in a way that maximizes the positive impact data science will have on their organization.

Affinity Marketing Leveraging Crowdsourced Psychographics
Ravi Iyer (Ranker) | Data Science | 4:30 PM | GC-130
The most important variables to use to discover your best future customers are increasingly psychological. Borrowing techniques from psychometrics, this talk shows how marketers can use disparate online data sources to measure the right psychographic variables in order to maximize both performance and scale.

Intuit's Payments Risk Platform
Boris Belyi (Intuit) | Data Science | 5:10 PM | FA-100
This talk explores the path taken at Intuit, the maker of TurboTax, Mint, and QuickBooks, to operationalize predictive analytics and highlights automations that have allowed Intuit to stay ahead of the fraud curve.

Intuit's Payments Risk Platform
Dusan Bosnjakovic (Intuit) | Data Science | 5:10 PM | FA-100
This talk explores the path taken at Intuit, the maker of TurboTax, Mint, and QuickBooks, to operationalize predictive analytics and highlights automations that have allowed Intuit to stay ahead of the fraud curve.

Backstage to a Data Driven Culture: Your Data Science and Analytics Stack
Pauline Chow (General Assembly) | Data Science | 5:50 PM | GC-150
When you're the first data professional at an organization, there are technical, process, and qualitative considerations for analytics and data science (A/DS) to address. This talk is an overview of strategy, infrastructure, and tools for creating your first A/DS stack. At this stage, the range of problems you are able to solve relates to organization, operations, data engineering, business intelligence, and communication. Creating the optimal A/DS stack can seamlessly pave the way to big data and to integrating the newest technologies in the future.

Real-time Aggregations, Approximations, Similarities, and Recommendations at Scale using Spark Streaming, ML, GraphX, Kafka, Cassandra, Docker, CoreNLP, Word2Vec, LDA, and Twitter Algebird
Chris Fregly (IBM Spark Technology Center) | Hadoop/Spark/Kafka | 10:30 AM | FA-100
Live, interactive recommendations demo: Spark Streaming, ML, GraphX, Kafka, Cassandra, Docker, CoreNLP, Word2Vec, LDA, and Twitter Algebird (advancedspark.com). Types of similarity: Euclidean vs. non-Euclidean similarity, Jaccard similarity, cosine similarity, log-likelihood similarity, edit distance. Text-based similarities and analytics: Word2Vec, LDA topic extraction, TextRank. Similarity-based recommendations: user-to-user, content-based, item-to-item (Amazon), collaborative, user-to-item (Netflix), graph-based item-to-item "pathways" (Spotify). Aggregations, approximations, and similarities at scale: Twitter Algebird, MinHash and bucketing, Locality Sensitive Hashing (LSH), Bloom filters, Count-Min Sketch, HyperLogLog.

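Two of the similarity measures listed above are simple enough to show inline; this is a small, self-contained Python illustration of Jaccard and cosine similarity, not code from the talk.

```python
# Jaccard and cosine similarity, two of the measures the session covers.
import math

def jaccard(a, b):
    """Jaccard similarity between two sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

print(jaccard({'spark', 'kafka', 'druid'}, {'spark', 'kafka', 'redis'}))  # 0.5
print(cosine([1.0, 0.0, 2.0], [1.0, 1.0, 2.0]))                           # ~0.913
```
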
Iterative Spark Development at Bloomberg
Nimbus Goehausen (Bloomberg) | Hadoop/Spark/Kafka | 11:10 AM | FA-100
This presentation will explore how Bloomberg uses Spark, with its formidable computational model for distributed, high-performance analytics, to take this process to the next level, and look into one of the innovative practices the team is currently developing to increase efficiency: the introduction of a logical signature for datasets.

Alluxio (formerly Tachyon): An Open Source Memory Speed Virtual Distributed Storage
Gene Pang (Alluxio) | Hadoop/Spark/Kafka | 11:50 AM | Library 4th floor
Alluxio, formerly Tachyon, is a memory-speed virtual distributed storage system. The Alluxio open source community is one of the fastest growing open source communities in big data history, with more than 300 developers from over 100 organizations around the world. In the past year, the Alluxio project experienced a tremendous improvement in performance and scalability and was extended with key new features including tiered storage, transparent naming, and unified namespace. Alluxio now supports a wide range of under storage systems, including Amazon S3, Google Cloud Storage, Gluster, Ceph, HDFS, NFS, and OpenStack Swift. This year, our goal is to make Alluxio accessible to an even wider set of users, through our focus on security, new language bindings, and further increased stability.

Data Provenance Support in Spark
Matteo Interlandi (UCLA) | Hadoop/Spark/Kafka | 1:30 PM | GC-150
Debugging data processing logic in Data-Intensive Scalable Computing (DISC) systems is a difficult and time-consuming effort. To aid this effort, we built Titian, a library that enables tracking data provenance through transformations in Apache Spark.

Introduction to Kafka
Jesse Anderson (Smoking Hand) | Hadoop/Spark/Kafka | 2:10 PM | GC-130
An introduction to what Kafka is, the concepts behind it, and its API.

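As a taste of the API the session introduces, here is a minimal produce-and-consume round trip using the kafka-python client; the broker address and topic name are assumptions for the sketch.

```python
# A minimal sketch of Kafka's producer and consumer APIs via kafka-python.
# Broker address and topic name are illustrative.
from kafka import KafkaProducer, KafkaConsumer

# Produce a few messages to a topic.
producer = KafkaProducer(bootstrap_servers='localhost:9092')
for i in range(3):
    producer.send('page-views', key=str(i).encode(), value=b'{"page": "/home"}')
producer.flush()

# Consume them back; Kafka retains messages on disk, so consumers can replay
# from any stored offset rather than only seeing "live" data.
consumer = KafkaConsumer('page-views',
                         bootstrap_servers='localhost:9092',
                         auto_offset_reset='earliest',
                         consumer_timeout_ms=5000)
for message in consumer:
    print(message.partition, message.offset, message.value)
```
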
Building an Event-oriented Data Platform
Eric Sammer (Rocana) | Hadoop/Spark/Kafka | 2:50 PM | Library 4th floor
While we frequently talk about how to build interesting products on top of machine and event data, the reality is that collecting, organizing, providing access to, and managing this data is where most people get stuck. In this session, we'll follow the flow of data through an end-to-end system built to handle tens of terabytes per day of event-oriented data, providing real-time streaming, in-memory, SQL, and batch access to this data. We'll go into detail on how open source systems such as Hadoop, Kafka, Solr, and Impala/Hive are actually stitched together; describe how and where to perform data transformation and aggregation; provide a simple and pragmatic way of managing event metadata; and talk about how applications built on top of this platform get access to data and extend its functionality. This session is especially recommended for data infrastructure engineers and architects planning, building, or maintaining similar systems.

Panel - Interactive Applications on Spark?
David Levinger (Paxata) | Hadoop/Spark/Kafka | 4:00 PM | GC-150
In this interactive panel discussion, you will hear from these Spark experts as to why they chose to go "all-in" on Spark, leveraging the rich core capabilities that make Spark so exciting, and committing to significant IP that turns Spark into a world-class enterprise data preparation engine. Raymond and David will explain specific cases where capabilities were built on top of core Spark to provide a true interactive data prep application experience. Innovations such as a Domain Specific Language (DSL), an optimizing compiler, a persistent columnar caching layer, application-specific Resilient Distributed Datasets (RDDs), and online aggregation operators address the core memory, pipelining, and shuffling obstacles to produce a highly interactive application with the user and data-volume scale-out benefits of Spark.

Panel - Interactive Applications on Spark?
Raj Babu (AgileISS) | Hadoop/Spark/Kafka | 4:00 PM | GC-150
In this interactive panel discussion, you will hear from these Spark experts as to why they chose to go "all-in" on Spark, leveraging the rich core capabilities that make Spark so exciting, and committing to significant IP that turns Spark into a world-class enterprise data preparation engine. Raymond and David will explain specific cases where capabilities were built on top of core Spark to provide a true interactive data prep application experience. Innovations such as a Domain Specific Language (DSL), an optimizing compiler, a persistent columnar caching layer, application-specific Resilient Distributed Datasets (RDDs), and online aggregation operators address the core memory, pipelining, and shuffling obstacles to produce a highly interactive application with the user and data-volume scale-out benefits of Spark.

Panel - Interactive Applications on Spark?
Raymond Fu (Trace3) | Hadoop/Spark/Kafka | 4:00 PM | GC-150
In this interactive panel discussion, you will hear from these Spark experts as to why they chose to go "all-in" on Spark, leveraging the rich core capabilities that make Spark so exciting, and committing to significant IP that turns Spark into a world-class enterprise data preparation engine. Raymond and David will explain specific cases where capabilities were built on top of core Spark to provide a true interactive data prep application experience. Innovations such as a Domain Specific Language (DSL), an optimizing compiler, a persistent columnar caching layer, application-specific Resilient Distributed Datasets (RDDs), and online aggregation operators address the core memory, pipelining, and shuffling obstacles to produce a highly interactive application with the user and data-volume scale-out benefits of Spark.

Why is my Hadoop cluster slow?
Bikas Saha (Hortonworks) | Hadoop/Spark/Kafka | 5:10 PM | GC-160
This talk draws on our experience debugging and analyzing Hadoop jobs to describe some methodical approaches to answering this question and presents current and new tracing and tooling ideas that can help semi-automate parts of this difficult problem.

Deep Learning at Scale
Alexander Kern (Pavlov) | Hadoop/Spark/Kafka | 5:50 PM | FA-100
The advent of modern deep learning techniques has given organizations new tools to understand, query, and structure their data. However, maintaining complex pipelines, versioning models, and tracking accuracy regressions over time remain ongoing struggles for even the most advanced data engineering teams. This talk presents a simple architecture for deploying machine learning at scale and offers suggestions for how companies can get their feet wet with open source technologies they already deploy.

Real Life IoT Architecture
Dinesh Srirangpatna (Microsoft) | NoSQL | 10:30 AM | GC-130
Learn how to benefit from the Internet of Things (IoT) to reduce costs and spur transformation for your company and clients. Attendees will learn about the building blocks of an IoT solution and walk through real-life architectural decisions made in building one.

Amazon DynamoDB - Focus on Your Data and Leave Ops to Someone Else
Michael Limcaco (Amazon Web Services) | NoSQL | 11:10 AM | GC-150
This talk explores the features and benefits of Amazon DynamoDB, a fully managed NoSQL database service, in detail, and discusses how to get the most out of DynamoDB, in addition to design best practices across multiple use cases.

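For a concrete feel of the service, here is a hedged boto3 sketch of writing and reading a single DynamoDB item; the table and attribute names are illustrative and not from the talk.

```python
# A hedged sketch of basic DynamoDB usage with boto3.
# Table and attribute names are illustrative.
import boto3

dynamodb = boto3.resource('dynamodb', region_name='us-west-2')
table = dynamodb.Table('Sessions')  # assumes a table with partition key 'session_id'

# Write an item; DynamoDB is schemaless beyond the key attributes.
table.put_item(Item={
    'session_id': 'druid-1030',
    'track': 'Big Data',
    'room': 'GC-150',
})

# Read it back with a strongly consistent point lookup.
response = table.get_item(Key={'session_id': 'druid-1030'}, ConsistentRead=True)
print(response.get('Item'))
```
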
Spark And Couchbase: Augmenting The Operational Database With Spark
Matt Ingenthron (Couchbase) | NoSQL | 11:50 AM | GC-160 | Sponsored
For an operational database, Spark is like Batman's utility belt: it handles a variety of important tasks, from data cleanup and migration to analytics and machine learning, that make the operational database much more powerful than it would be on its own. In this talk, we describe the Couchbase Spark Connector that lets you easily integrate Spark with Couchbase Server, an open source distributed NoSQL document database that provides low-latency data management for large-scale, interactive online applications. We'll start with common use cases for Spark and Couchbase, then cover the basics of creating, persisting, and consuming RDDs and DataFrames from Couchbase's key/value and SQL interfaces.

Using Redis Data Structures to Make Your App Blazing Fast
Adi Foulger (Redis Labs) | NoSQL | 1:30 PM | GC-130
Open Source Redis is not only the fastest NoSQL database but also the most popular among the new wave of databases running in containers. This talk introduces the data structures used to speed up applications and solve the everyday use cases that are driving Redis' popularity.

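As a quick illustration of the kinds of data structures the talk covers, here is a short redis-py sketch using a sorted set, a hash, and a list; the key names and values are made up for the example.

```python
# A brief sketch of three core Redis data structures via redis-py.
# Key names and values are illustrative.
import redis

r = redis.Redis(host='localhost', port=6379, db=0)

# Sorted set: a real-time leaderboard with O(log N) inserts and range reads.
r.zadd('leaderboard', {'alice': 3120, 'bob': 2890, 'carol': 3555})
print(r.zrevrange('leaderboard', 0, 2, withscores=True))

# Hash: a compact per-user profile object.
r.hset('user:42', 'plan', 'pro')
print(r.hgetall('user:42'))

# List: a simple work queue (producers LPUSH, consumers RPOP).
r.lpush('jobs', '{"task": "resize", "image_id": 7}')
print(r.rpop('jobs'))
```
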
Apache Kudu: Fast Analytics on Fast Data
Dan Burkert (Cloudera) | NoSQL | 2:10 PM | GC-150
Apache Kudu (incubating) is a new storage engine for the Hadoop ecosystem that enables extremely high-speed analytics without imposing data-visibility latencies. This talk provides an introduction to Kudu and an overview of how, when, and why practitioners use Kudu as a platform for building analytics solutions.

Big Data and Real Estate
Anton Polishko (Zulloo) | NoSQL | 2:50 PM | GC-150
The real estate industry is generating terabytes of data, but a very small percentage is being utilized or processed. ZULLOO Inc. is creating an artificial intelligence engine utilizing big data and machine learning. The question is, why aren't more data scientists exploring the real estate industry when it represents 15% of US GDP, measured in the trillions of dollars?

Big Data and Real Estate
Jon Zifcak (Zulloo) | NoSQL | 2:50 PM | GC-150
The real estate industry is generating terabytes of data, but a very small percentage is being utilized or processed. ZULLOO Inc. is creating an artificial intelligence engine utilizing big data and machine learning. The question is, why aren't more data scientists exploring the real estate industry when it represents 15% of US GDP, measured in the trillions of dollars?

Analytics at the Speed of Light with Redis and Spark
Dave Neilsen (Redis Labs) | NoSQL | 3:50 PM | Library 4th floor | Sponsored
Spark is in-memory; Redis is in-memory. The Spark-Redis connector gives Spark access to Redis' data structures as RDDs. Redis, with its blazing-fast performance and optimized in-memory data structures, reduces Spark processing time by up to 98%. In this talk, Dave will share the top use cases for Spark-Redis, such as time series, recommendations, and real-time bid management.

Introduction to Graph Databases
Oren Golan (Sanguine) | NoSQL | 4:30 PM | FA-100
Many organizations have adopted graph databases, across IoT, health care, financial services, telecommunications, and government. This talk, based on our research and implementation of a graph database at Sanguine, a startup based in LA, dives into a few use cases and equips attendees with everything they need to start using a graph database.

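The talk does not name a specific product; purely as one concrete illustration of a property-graph workflow, here is a minimal Cypher example using the official Neo4j Python driver, with the connection details and data invented for the sketch.

```python
# A minimal property-graph example with the Neo4j Python driver.
# Connection details, labels, and names are illustrative.
from neo4j import GraphDatabase

driver = GraphDatabase.driver('bolt://localhost:7687', auth=('neo4j', 'password'))

with driver.session() as session:
    # Create two people and a relationship between them.
    session.run(
        "MERGE (a:Person {name: $a}) "
        "MERGE (b:Person {name: $b}) "
        "MERGE (a)-[:KNOWS]->(b)",
        a='Ada', b='Grace')

    # Traverse the graph: who does Ada know, directly or through one hop?
    result = session.run(
        "MATCH (a:Person {name: $a})-[:KNOWS*1..2]->(p) RETURN DISTINCT p.name",
        a='Ada')
    print([record['p.name'] for record in result])

driver.close()
```
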
MongoDB 3.2 Goodness!!!
Mark Helmstetter (MongoDB) | NoSQL | 5:10 PM | GC-130
This talk explores the new features of MongoDB 3.2, such as $lookup, document validation rules, and encryption at rest, and tools like the BI Connector, Ops Manager 2.0, and Compass.

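As an example of one headline 3.2 feature, here is a hedged PyMongo sketch of a $lookup (left outer join) aggregation; the database, collections, and field names are illustrative.

```python
# A hedged sketch of MongoDB 3.2's $lookup (left outer join) via PyMongo.
# Database, collection, and field names are illustrative.
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')
db = client.shop

pipeline = [
    {'$match': {'status': 'shipped'}},
    # Join each order with the matching customer document.
    {'$lookup': {
        'from': 'customers',
        'localField': 'customer_id',
        'foreignField': '_id',
        'as': 'customer',
    }},
    {'$unwind': '$customer'},
    {'$project': {'order_id': 1, 'total': 1, 'customer.name': 1}},
]

for doc in db.orders.aggregate(pipeline):
    print(doc)
```
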
Privacy vs. Security in a Big Data World
Tamara Dull (SAS Institute) | NoSQL | 5:50 PM | Library 4th floor
The jury is still out on whether Edward Snowden was a hero, traitor, or schmuck. Regardless of the scarlet letter we want to hang around his neck, we should thank him for helping bring the discussion of big data privacy and security to the public square. This session examines the issues of big data privacy and security in the context of the six-stage (big) data lifecycle: create, store, use, share, archive, and destroy.

Reliable Media Reporting in an Ever-changing Data Landscape
Eric Avila (NBCU & OnPrem Solution Partners) | Use Case Driven | 10:30 AM | Library 4th floor
OnPrem Solution Partners worked with NBCU to profile in-house data to determine data quality and to recommend process and quality improvements. We present our process for data import, improvements we want to make, and lessons learned regarding the various tools used, including MariaDB, Elasticsearch, Cassandra, and others.

Reliable Media Reporting in an Ever-changing Data Landscape
Josh Andrews (NBCU & OnPrem Solution Partners) | Use Case Driven | 10:30 AM | Library 4th floor
OnPrem Solution Partners worked with NBCU to profile in-house data to determine data quality and to recommend process and quality improvements. We present our process for data import, improvements we want to make, and lessons learned regarding the various tools used, including MariaDB, Elasticsearch, Cassandra, and others.

Reliable Media Reporting in an Ever-changing Data Landscape
Rachel Kelley (NBCU & OnPrem Solution Partners) | Use Case Driven | 10:30 AM | Library 4th floor
OnPrem Solution Partners worked with NBCU to profile in-house data to determine data quality and to recommend process and quality improvements. We present our process for data import, improvements we want to make, and lessons learned regarding the various tools used, including MariaDB, Elasticsearch, Cassandra, and others.

The Encyclopedia of World Problems
Christine Zhang (Knight-Mozilla @ LA Times) | Use Case Driven | 11:10 AM | GC-160
Born more than four decades ago from the partnership of two international NGOs in Brussels, the Encyclopedia of World Problems has hand-picked and refined profiles of tens of thousands of problems occurring around the world: from notorious global issues all the way down to very specific and peculiar ones. This talk presents an overview of the Encyclopedia and the interesting data science applications that have arisen from the Encyclopedia's body of work, notably its database resources.

BI is broken
Dave Fryer (Domo) | Use Case Driven | 11:50 AM | GC-150 | Sponsored
Not all BI solutions are created equal. The problem in most organizations is that disparate systems hold data hostage. Most systems create barriers between the data and the people who need the data to make decisions. We create silos of data that do not give us a holistic view of how the organization is operating. Domo is breaking down these silos and giving business users unparalleled access to the data they need to optimize their business.

Dealing with Data Discomfort: Getting Bureaucrats to Embrace Data and Analytics
Juan Vasquez (Mayor's Operations Innovation Team at City of Los Angeles) | Use Case Driven | 1:30 PM | GC-160
Government is traditionally known for red tape, stuffy hierarchies, endless policies, and clashing priorities. These and other variables make it difficult for government entities to embrace change and innovation, and more importantly leave them apprehensive about peeling back the layers and letting data tell the stories. So how do you change that? In this talk we'll discuss how the Mayor's Operations Innovation Team is leveraging storytelling, education, public-private partnerships, and data visualization technologies to help LA embrace data.

Data and Hollywood: "Je t'Aime ... Moi Non Plus"
Yves Bergquist (USC Entertainment Technology Center) | Use Case Driven | 2:10 PM | FA-100
The application of machine learning to problems such as script and story analysis, audience segmentation, and security is revolutionizing the way Hollywood creates and markets entertainment.

Hydrator: Open Source, Code-Free Data Pipelines
Jon Gray (Cask Data) | Use Case Driven | 2:50 PM | GC-130
This talk will present how to build data pipelines with no code using the open source, Apache 2.0-licensed Cask Hydrator, and will continue with a live demonstration of creating data pipelines for two use cases.

From Clusters to Clouds, Hardware Still Matters
Eric Lesser (PSSC Labs) | Use Case Driven | 3:50 PM | GC-160 | Sponsored
Today's software-defined environments attempt to remove the weakness of computing hardware from the operational equation. There is no doubt that this is a natural progression away from overpriced, proprietary compute and storage layers. However, even at the heart of any software-defined universe is an underlying hardware stack that must be robust, reliable, and cost effective. Our 20+ years of experience delivering over 2,000 clusters and clouds has taught us how to properly design and engineer the right hardware solution for big data, cluster, and cloud environments. This presentation will share this knowledge, allowing users to make better design decisions for any deployment.

How to Use Design Thinking to Jumpstart Your Big Data Projects
Peter Reale (Datameer) | Use Case Driven | 4:30 PM | GC-160
There is a novel approach to identifying big data use cases, one which will ultimately lower the barrier to entry to big data projects and increase overall implementation success. This talk describes the approach used by big data pioneer and Datameer CEO Stefan Groschupf to drive over 200 production implementations.

Shaping the Role of Data Science: An Evolution towards Prescriptive Analytics as Key Driver in Revenue Acceleration
Thomas Sullivan (IRIS.TV) | Use Case Driven | 5:10 PM | Library 4th floor
At IRIS.TV, our business builds algorithmic solutions for video recommendation with the end goal of delivering a great user experience, as evidenced by users viewing more video content. This talk outlines our reasons for expanding from a descriptive/predictive approach to data analytics toward a philosophy that features more prescriptive analytics, driven by our data science team.

Raising Venture Capital for Data Driven Startups
Austin Clements (TenOneTen Ventures) | Use Case Driven | 5:50 PM | GC-160
Get an inside look into how VCs evaluate your team, market, and product before making an investment decision. Learn how to identify the right investors for your business and how to stand out from the crowd.