DataConf

DataConf is a casual data science conference by DataHack.

This is a list of agendas and resources of past and upcoming iterations of the conference.

You can find us on our website, Facebook, Meetup, and Twitter, and you can join our monthly newsletter.


Agendas:

DataConf 2016


Facebook event page: https://www.facebook.com/events/555816034629213

Meetup event page: https://www.meetup.com/preview/DataHack/events/234096133




Speaker: Adi Nesher, PayPal

Title: Defining the right label: How to create a valuable population tag in situations of uncertainty

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2016/Smart_Labeling_of_Machine_Learning_Problems.pdf




Speakers: Amalia Bryl & Shahar Wilner, EDvantage

Title: Can machine learning empower education?

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2016/Edvantage-Machine_learning_and_edtech.pdf




Held on Thursday, October 26th, between 09:00 and 18:00, DataConf 2017 drew a crowd of over 100 data science and machine learning experts from the top companies in Israel for a day of knowledge sharing.

Event website: http://dataconf.org/

Meetup event page: https://www.meetup.com/DataHack/events/244004618/

Facebook event page: https://www.facebook.com/events/1623405514382356/


Speaker: Yakov Shambik, Vehicles Detection Technology Manager @ Mobileye

Title: Eye of the Beholder: Object detection in Mobileye using Deep Neural Networks and other techniques

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Mobileye_Yakov_Shambik.pdf


Speaker: Ofer Ron, Head of Data Science @ LivePerson

Title: Concepts before machinery: Harnessing the power of domain expertise for machine-learning-based solutions

Video: https://www.youtube.com/watch?v=wR2u7V8D5Y8&list=PLZYkt7161wELbPfqY92vAEmKVhsyxg5Nk&index=3

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_LivePerson_Ofer_Ron.pdf


Speaker: Alex Ran, Distinguished Engineer @ Intuit

Title: Using Data Science for Automated Accounting

Video: https://www.youtube.com/watch?v=_ZBos8T35D0&list=PLZYkt7161wELbPfqY92vAEmKVhsyxg5Nk&index=2

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Intuit_Alex_Ran.pdf


Speaker: Meir Maor, Chief Architect @ SparkBeyond

Title: Developing Simple and Stable Machine Learning Models

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_SparkBeyond_Meir_Maor.pdf


Speaker: Roii Spoliansky, Lead Data Scientist @ PayPal

Title: Active learning optimization as a function of label cost and mistake cost

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_PayPal_Roii_Spoliansky.pdf


Speaker: Gil Chamiel, Director of Data Science and Algorithms @ Taboola

Title: Don’t believe everything your network tells you: Uncertainty in deep learning for recommender systems

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Taboola_Gil_Chamiel.pdf


Speaker: Adina Lederhendler, Senior Data Scientist @ Neura

Title: General vs. subpopulation-specific modeling: When and why you need to get specific

Video: https://www.youtube.com/watch?v=ft36Tq5FUz0&list=PLZYkt7161wELbPfqY92vAEmKVhsyxg5Nk&index=4

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Neura_Adina_Lederhandler.pdf


Speaker: Yonatan Wexler, VP R&D @ Orcam

Title: Fast and Furious Face Recognition: Efficient metric learning for video stream data

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Orcam_Yonatan_Wexler.pdf


Speaker: Itamar Ben-Ari, Research Scientist @ Intel

Title: Differentiable Memory Allocation Mechanism For Neural Computing

Video: https://www.youtube.com/watch?v=DAHTNElXXgk&list=PLZYkt7161wELbPfqY92vAEmKVhsyxg5Nk&index=4

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Intel_Itamar_Ben_Ari.pdf


Speaker: Dr. Oshri, Senior Research Scientist @ Rafael

Title: Multi-agent deep reinforcement learning in communication networks

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Rafael.pdf


Held on Thursday, October 4th, between 09:00 and 18:00, DataConf 2018 drew a crowd of over 100 data science and machine learning experts from the top companies in Israel for a day of knowledge sharing.

YouTube playlist: https://www.youtube.com/playlist?list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53

Event website: http://dataconf.org/

Meetup event page: https://www.meetup.com/DataHack/events/255082526/

Facebook event page: https://www.facebook.com/events/1967922793269453/


Speaker: Dana Kaner, Data Scientist @ PerimeterX

Title: Bootstrap, Random Forest and All Sorts of Magic

Video: https://www.youtube.com/watch?v=ynkJVd6B13U

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_PerimeterX_Dana_Kaner.pdf

Abstract: The Bootstrap resampling method is often used for statistical inference. We demonstrate its power and simplicity through the well-known Random Forest algorithm. We present both the theoretical background on the above topics and an implementation in R.
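As a rough illustration of the resampling idea in this abstract, here is a minimal percentile-bootstrap sketch. It is written in Python rather than the R used in the talk and is not the speaker's code; the function name `bootstrap_ci`, its parameters, and the sample data are all made up for the example. Random Forest applies the same resample-with-replacement step to the training rows before fitting each tree.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` of `sample`.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n = len(sample)
    # Draw n points with replacement, recompute the statistic, and repeat.
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_resamples))
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the estimates.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Made-up measurements, used only to exercise the function.
data = [2.1, 3.4, 1.9, 5.6, 4.4, 3.8, 2.7, 4.9, 3.1, 3.6]
low, high = bootstrap_ci(data)
print(f"95% bootstrap CI for the mean: [{low:.2f}, {high:.2f}]")
```

Passing a different `stat` (for example `statistics.median`) gives an interval for that statistic without any new derivation, which is the simplicity the abstract refers to.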


Speaker: Pavel Levin, Senior Data Scientist @ Booking.com

Title: Where should I travel next? Modeling multi-destination trips with Recurrent Neural Networks.

Video: https://www.youtube.com/watch?v=pwfwUA4ZShI&t=0s&index=5&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Booking_Pavel_Levin.pdf

Abstract: Many real-world problems naturally give rise to sequential data. Language models are already widely used to tackle computational problems related to natural language. We would like to present a non-NLP example by walking through a solution to the problem of recommending next destinations to customers who are taking a single trip to multiple cities using RNN-based sequence modeling.


Speaker: Ari Bornstein, Senior Cloud Developer Advocate @ Microsoft

Title: Beyond Word Embeddings

Video: https://www.youtube.com/watch?v=zeYwMIDo05w&t=0s&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53&index=6

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Microsoft_Ari_Bornstein.pdf

Abstract: Since the advent of word2vec, word embeddings have become a go-to method for encapsulating distributional semantics in NLP applications. This presentation will review the strengths and weaknesses of using pre-trained word embeddings, and demonstrate how to incorporate more complex semantic representation schemes such as Semantic Role Labeling, Abstract Meaning Representation and Semantic Dependency Parsing into your applications.


Speaker: Dr. Michal Shmueli-Scheuer, Researcher @ IBM Research

Title: Conversational Bots for Customer Support

Video: https://www.youtube.com/watch?v=i567nLfEGYs&t=0s&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53&index=9

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_IBM_Michal_Shmueli_Scheuer.pdf

Abstract: In this talk, I'll cover various aspects of conversational bots, focusing on the domain of customer support. Often, human conversations with bots mimic the way humans interact with each other. Moreover, even when customers know that they are interacting with virtual agents (bots), they still expect them to behave like humans. One way to improve interactions with bots is by giving them some human characteristics, such as emotion and personality. I'll show how a model of neural response generation can be used to generate bot responses according to a target personality. I'll then cover a methodology for detecting egregious conversations in a setting using conversational bots by examining behavioral cues from the customer, patterns in the agents’ responses, and customer-agent interactions.


Speaker: Nofar Betzalel, Data Scientist @ PayPal

Title: Semi-Supervised Learning – to extend our Tagging Coverage

Video: https://www.youtube.com/watch?v=c4-3697xwys&index=7&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53&t=0s

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_PayPal_Nofar_Betzalel.pdf

Abstract: When PayPal's risk decision-making processes approve a transaction, we soon know whether it was the right decision. However, for declined transactions this is not the case, as our tagging coverage is not complete. This makes it more challenging for analysts and data scientists to understand our false positives when performing research and when measuring our decision-making processes. In this talk I will discuss how we use semi-supervised learning to tag declined transactions as ones that would or would not have been fraudulent had they been approved. This approach enables us to utilize both tagged and non-tagged transactions to train a model for the task at hand.


Speaker: Dr. Lev Faivishevsky, Researcher @ Intel Advanced Analytics

Title: Using Deep Learning to Detect Video Distortions

Video: https://www.youtube.com/watch?v=FhMWZgs0kJ8&t=0s&index=8&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Intel_Lev_Faivishevsky.pdf

Abstract: Since the acquisition of Mobileye, it has become common knowledge that Intel is interested in building AI-based products and producing hardware for AI applications. A less widely known role of AI at Intel is an internal one: using the huge and diverse data related to Intel's own operations to transform the way the company works and create significant value. Processor design, manufacturing and sales are leveraging machine-learning methods, including computer vision, natural language processing and reinforcement learning techniques. The talk will start with a little background about these applications, and focus on one deep-learning-based video analytics solution, used in the context of processor validation. We will describe this non-standard use case and the challenges in resolving it, most of which are also relevant for other use cases in the domain, including handling scarcity of labeled data and coping with tight requirements in terms of both accuracy and run-time.


Speaker: Prof. Danny Pfeffermann, National Statistician of Israel @ Central Bureau for Statistics

Title: Can Big Data Really Replace Traditional Surveys for the Production of Official Statistics

Video: https://www.youtube.com/watch?v=OcD20PkNj-w&t=0s&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53&index=10

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Lamas_Danny_Pfeffermann.pdf

Abstract: The big advancements in technology, which make it possible to access and analyse 'big data', coupled with increased demand for more accurate, more detailed and more timely official data, but with tightened available budgets, put inevitable pressure on producers of official statistics to replace traditional sample surveys with big data sources. In the first part of my presentation I shall discuss some of the major challenges in the use of big data for official statistics, pointing out their advantages and limitations. In the second part I shall consider a general class of statistical models, which can possibly link the big data under consideration to the corresponding target, finite population data. The use of a model in the class may allow estimating finite population parameters, without the need for reference samples or administrative files.


Speaker: Avi Hendler-Bloom, Algorithms Developer @ Mobileye

Title: Overcoming the Electronic Traffic Sign Problem

Video: https://www.youtube.com/watch?v=QN9gfUZUqDU

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Mobileye_Avi_Hendler_Bloom.pdf

Abstract: Electronic traffic signs are commonly made with LEDs. Due to the differences in frequency and phase between each LED light, classifying this type of sign is challenging. This talk will address the issues faced, and introduce a solution.


Speaker: Daniel Benzaquen, Data Scientist @ Lightricks

Title: AB testing at Scale

Video: https://www.youtube.com/watch?v=-k1X2MRgGlY

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Lightricks_Daniel_Benzaquen.pdf

Abstract: A/B testing is a central statistical procedure used frequently by data scientists. Unfortunately, the standard A/B testing framework was originally designed to cope with a handful of tests, while these days conducting tens or even hundreds of tests simultaneously is a common scenario.

Directly applying the standard procedure, however, is highly problematic, as many tests imply many false discoveries, which potentially lead to sub-optimal performance. With the goal of controlling the false-discovery rate, several procedures were designed: probably the most naive one is the Bonferroni correction; more advanced schemes are Fisher's least-significant-difference, Benjamini-Hochberg, etc. Yet utilizing these schemes comes at the price of a high false-negative rate that scales with the number of tests being conducted.

In this talk we discuss our attempt to bypass these challenges by utilizing a Bayesian Multi-Armed-Bandit approach, namely Thompson Sampling (TS), which operates in an online-learning manner. We share our experience and insights based on simulations and real-life experiments.

Finally, we discuss some generalizations of the standard TS scheme that we made, which allow us to optimize over (non-trivial) statistical quantities (i.e., not necessarily the conversion rate or click-through rate, which are of obvious interest, but also users' Life-Time Value (LTV), etc.).


Speaker: Gil Chamiel, Director of Algorithms and Data Science @ Taboola

Title: Deep And Shallow Learning in Recommendation Systems

Video: https://www.youtube.com/watch?v=nghXG5OiUno&index=12&t=0s&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Taboola_Gil_Chamiel.pdf

Abstract: Deep learning has been gaining increasing attention in the recommendation systems community, replacing some of the traditional methods. In this talk, we will share some lessons we learned from using deep learning at huge scale in Taboola's recommendation system. Specifically, we will talk about the motivation for using deep learning and the tradeoffs between deep models and simpler models. We will discuss our approach to building neural networks with multiple input types (numerical, categorical, text, and images); capturing non-trivial interactions between features using both deep dense architectures and Factorization Machine models; tradeoffs between memorization and generalization; and other tips regarding network architectures.
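The Thompson Sampling approach named in the "AB testing at Scale" abstract above can be illustrated with a minimal Beta-Bernoulli sketch. This is a textbook version under stated assumptions, not the speakers' implementation: the function name `thompson_sampling`, the uniform Beta(1, 1) prior, and the simulated conversion rates are all hypothetical.

```python
import random


def thompson_sampling(true_rates, n_rounds=5000, seed=0):
    """Simulate a Beta-Bernoulli Thompson Sampling bandit.

    `true_rates` are the (normally unknown) conversion rates of each variant;
    they are used here only to simulate user responses.
    Returns the number of times each variant was shown.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        # Sample one plausible rate per variant from its Beta posterior
        # (Beta(1, 1) prior updated with the observed successes/failures).
        draws = [rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)]
        # Show the variant whose sampled rate is highest.
        arm = max(range(k), key=draws.__getitem__)
        # Simulate the user's response and update that variant's counts.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


# Made-up conversion rates for three variants.
pulls = thompson_sampling([0.05, 0.10, 0.30])
print("Times each variant was shown:", pulls)
```

Because traffic shifts toward better-performing variants as evidence accumulates, the procedure runs online instead of waiting for a fixed-sample significance test on every variant.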


Speaker: Oren Shamir, Head of CV Algorithm Development @ Innoviz Technologies

Title: Neural networks for point clouds: Adding the 3rd Dimension

Video: https://www.youtube.com/watch?v=aE3mfLm5dMA&t=0s&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53&index=11

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Booking_Innoviz_Shamir.pdf

Abstract: Since AlexNet, DNNs have been used with rapidly increasing success to perform a wide variety of tasks on 2D images. This is the result of increased data availability, increased effective processing power, as well as incremental algorithmic improvements. Today, DNNs achieve super-human results on multiple tasks in the 2D data domain.

Processing of 3D data using DNNs has been studied less during that time. 3D sensors are less abundant, and are more variable in their capabilities and properties. In the past few years various methods for processing of 3D data have emerged, driven mainly by the medical imaging industry and, more recently, the autonomous car industry. 3D data may be unstructured, sparse and irregular, yielding unique challenges relative to 2D image data.

In this talk I will discuss the challenges of working with 3D data, and present an overview of approaches towards 3D data processing in DNNs.
