Search results “Mining of massive datasets jure leskovec”
Lecture 27 — Solving the BIGCLAM | Mining of Massive Datasets | Stanford University
 
09:20
Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for "FAIR USE" for purposes such as criticism, comment, news reporting, teaching, scholarship, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing. Non-profit, educational or personal use tips the balance in favor of fair use.
Lecture 87 — Exploiting Length (Advanced) | Stanford University
 
14:40
Lecture 91 — Hubs and Authorities (Advanced) | Stanford University
 
15:17
Jure Leskovec, Stanford - Stanford Medicine Big Data | Precision Health 2017
 
13:19
Jure Leskovec, PhD Bringing together thought leaders in large-scale data analysis and technology to transform the way we diagnose, treat and prevent disease. Visit our website at http://bigdata.stanford.edu/.
Views: 1664 Stanford Medicine
PageRank Python
 
07:16
The PowerPoint slides and data are from the CS246 Mining Massive Data Sets course at Stanford University, taught by Professor Jure Leskovec.
Views: 1987 Yuan X
DOE CSGF 2012: Discovering Knowledge from Massive Networks and Science Data
 
40:32
View more information on the DOE CSGF Program at http://www.krellinst.org/csgf Alok Choudhary, John G. Searle Professor of Electrical Engineering and Computer Science, Northwestern University. Knowledge discovery in science and engineering has been driven by theory, experiments and, more recently, by large-scale simulations using high-performance computers. Modern experiments and simulations involving satellites, telescopes, high-throughput instruments, imaging devices, sensor networks, accelerators, and supercomputers yield massive amounts of data. At the same time, the world, including social communities, is creating massive amounts of data at an astonishing pace. Just consider Facebook, Google, articles, papers, images, videos and others. But even more complex is the network that connects the creators of data. There is knowledge to be discovered in both. This represents a significant and interesting challenge for HPC and opens opportunities for accelerating knowledge discovery. In this talk, following an introduction to high-end data mining and the basic knowledge discovery paradigm, we present the process, challenges and potential for this approach. We will present many case examples, results and future directions including (1) discovering knowledge from massive datasets from science applications including climate and medicine; (2) real-time stream mining of text from millions of and tweets to identify influencers and sentiments of people; (3) discovering knowledge from massive social networks containing millions of nodes and hundreds of billions of edges from real-world Facebook, Twitter and other social network data; and (4) predicting structures from simulation data. The talk will be illustrative and example driven and may include 1-2 live demonstrations.
Views: 123 Krell Institute
Mining Online Data Across Social Networks
 
01:04:14
Capturing Data, Modeling Patterns, Predicting Behavior. Based on collecting more than 20 million blog posts and news media articles per day, Professor Jure Leskovec discusses how to mine such data to capture and model temporal patterns in the news over a daily time scale, in particular the succession of story lines that evolve and compete for attention. He discusses models to quantify the influence of individual media sites on the popularity of news stories and algorithms for inferring hidden networks of information flow. Learn more: http://scpd.stanford.edu/
Views: 20482 stanfordonline
Local Higher-Order Graph Clustering
 
03:01
Local Higher-Order Graph Clustering. Hao Yin (Stanford University), Austin R. Benson (Stanford University), Jure Leskovec (Stanford University), David F. Gleich (Purdue University). Local graph clustering methods aim to find a cluster of nodes by exploring a small region of the graph. These methods are attractive because they enable targeted clustering around a given seed node and are faster than traditional global graph clustering methods because their runtime does not depend on the size of the input graph. However, current local graph partitioning methods are not designed to account for the higher-order structures crucial to the network, nor can they effectively handle directed networks. Here we introduce a new class of local graph clustering methods that address these issues by incorporating higher-order network information captured by small subgraphs, also called network motifs. First, we show how to adapt the approximate personalized PageRank algorithm to find clusters containing a seed node with minimal motif conductance, a generalization of the conductance metric for network motifs. We also generalize existing theory to maintain the properties of fast running time (independent of the size of the graph) and cluster quality (in terms of motif conductance). For community detection tasks on both synthetic and real-world networks, our new framework outperforms the current edge-based personalized PageRank methodology. Second, we develop a theory of node neighborhoods for finding sets that have small motif conductance, where the motif is a clique. We apply these results to the case of finding good seed nodes to use as input to the personalized PageRank algorithm. More on http://www.kdd.org/kdd2017/.
Views: 769 KDD2017 video
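The conductance metric named in the abstract above has a simple edge-based form, which the talk generalizes to motifs. The following is a minimal sketch in Python, assuming an undirected graph stored as a dict of neighbor sets. The function name `conductance`, the toy graph, and the choice to return infinity for a degenerate set are illustrative assumptions, not taken from the paper or its code.

```python
from typing import Dict, Hashable, Iterable, Set

# Assumed representation: undirected graph as {node: set of neighbors}.
Graph = Dict[Hashable, Set[Hashable]]


def conductance(adj: Graph, cluster: Iterable[Hashable]) -> float:
    """Edge conductance of a node set S: cut(S) / min(vol(S), vol(V - S)).

    cut(S) counts edges with exactly one endpoint in S;
    vol(X) is the sum of the degrees of the nodes in X.
    """
    s = set(cluster)
    # Each crossing edge is seen once, from its endpoint inside S.
    cut = sum(1 for u in s for v in adj[u] if v not in s)
    vol_s = sum(len(adj[u]) for u in s)
    vol_rest = sum(len(adj[u]) for u in adj if u not in s)
    denom = min(vol_s, vol_rest)
    # Degenerate case (empty side or no edges): treated as worst score here.
    return cut / denom if denom else float("inf")


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
two_triangles: Graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

print(round(conductance(two_triangles, {0, 1, 2}), 4))  # 0.1429 (= 1/7)
```

A lower value means a better-separated cluster: one triangle has a single crossing edge against a volume of 7. The motif variant described in the talk counts motif instances instead of edges, which this sketch does not attempt.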
Convolutional Networks with Adaptive Computation Graphs
 
17:18
Presentation O-1A-01 of the European Conference on Computer Vision 2018, Munich, Germany. Webpage: https://eccv2018.org Title: Convolutional Networks with Adaptive Computation Graphs. Date: Monday, September 10, 2018. Speaker: Andreas Veit. Authors: Andreas Veit*, Cornell University; Serge Belongie, Cornell University
Data Science - AntenaDev #02
 
01:29:08
Data Science - AntenaDev #02 Published on 04/12/2017. To everyone tuned in: welcome to the second episode of the main track of the AntenaDev channel! In this episode we present a legendary interview with three renowned data scientists. Learn all the details of this field, which is revolutionizing technology and people's lives. Get ready! Your life is already being transformed by data science. Find out what data science is, its main concepts, its applications, the opportunities that are emerging, how the world is being transformed by it, and how to become a data scientist.
Participants: Hosts: - Prof. José Maria Monteiro, a host who tries to be politically correct. But he only tries. - Prof. Marcelo Gonçalves, the DevMan who was left without electricity... Interviewees: - Aderson Olivera - http://aderson.com - Igo Brilhante - https://www.wanderpaths.com/ - Tales Matos - www.arida.ufc.b - Nauber Gois - innovanti.io/workshopdatascience
Episode links: - Data Science Workshop: innovanti.io/workshopdatascience - Kaggle: www kaggle.com - Coursera: www.coursera.com - Data Science certification (CAP): https://www.certifiedanalytics.org/ - Wanderpaths: https://www.wanderpaths.com - Wanderpaths on Google Play: https://play.google.com/store/apps/details?id=com.wanderpaths.app - Wanderpaths on the Apple Store: https://itunes.apple.com/br/app/wanderpaths/id1147166365?mt=8 - Wanderpaths on Instagram: @wanderpathsapp - Wanderpaths on Facebook: @wanderpaths - Nauber's channel: https://www.youtube.com/channel/UCZctB98Gn7af-3OpGC-7-Sg - How to find Igo Brilhante: @igobrilhante - Data visualization: https://uber.github.io/deck.gl/#/ https://d3js.org/ - Books on data science: https://www.amazon.com/Data-Mining-Concepts-Techniques-Management/dp/0123814790 https://www.amazon.com/Mining-Social-Web-Facebook-LinkedIn/dp/1449367615/ref=sr_1_1?s=home-garden&ie=UTF8&qid=1512041753&sr=8-1&keywords=mining+web+python https://www.amazon.com/Mining-Massive-Datasets-Jure-Leskovec/dp/1107077230/ref=sr_1_1?ie=UTF8&qid=1512041778&sr=8-1&keywords=mining+of+massive+datasets - Interesting datasets: https://elitedatascience.com/datasets - Platforms: https://databricks.com/ https://databricks.com/blog/2016/01/25/deep-learning-with-apache-spark-and-tensorflow.html https://hortonworks.com/ https://www.cloudera.com/ - Diary of a data scientist at Booking.com (I'm a fan of Booking): https://towardsdatascience.com/diary-of-a-data-scientist-at-booking-com-924734c71417
Production and content: • AntenaDev Inovação Editing and sound: • AntenaDev Inovação Social networks: • Blog: www.antenadev.com.br • Facebook: www.facebook.com/AntenaDev • Instagram: @AntenaDev – http://www.instagram.com/AntenaDev • Twitter: @AntenaDev – www.twitter.com/AntenaDev Acknowledgments: • https://www.freesound.org/ • https://www.pond5.com • http://br.freepik.com/ • https://pixabay.com/ • https://www.pexels.com/ • http://audiomicro.com/royalty-free-music Category • Education License • Standard YouTube License
Views: 202 AntenaDev
Lecture 29 — What Makes a Good Cluster (Advanced) | Stanford University
 
08:49
Algorithmic and Statistical Perspectives on Large-Scale Data Analysis, 2/22/2010
 
01:14:52
SF Bay Area ACM Data Mining SIG http://www.sfbayacm.org/?p=1265 Location: LinkedIn, 2027 Stierlin Ct., Mountain View, CA 94043. Notice: NEW MEETING LOCATION for 2010 Date: Monday Feb 22, 2010; 6:30 pm Cost: Free and open to all who wish to attend, but membership is only $20/year. Anyone may join our mailing list at no charge, and receive announcements of upcoming events. Speaker: Michael W. Mahoney, Stanford University TITLE: "Algorithmic and Statistical Perspectives on Large-Scale Data Analysis" DESCRIPTION: Computer scientists and statisticians have historically adopted quite different views on data and thus on data analysis. In recent years, however, ideas from statistics and scientific computing have begun to interact in increasingly sophisticated and fruitful ways with ideas from computer science and the theory of algorithms to aid in the development of improved worst-case algorithms that are also useful in practice for solving large-scale scientific and Internet data analysis problems. After reviewing these two complementary perspectives on data, I will describe two recent examples of improved algorithms that used ideas from both areas in novel ways. The first example has to do with improved methods for structure identification from large-scale DNA SNP data, a problem which can be viewed as trying to find good columns or features from a large data matrix. The second example has to do with selecting good clusters or communities from a data graph, or demonstrating that there are none, a problem that has wide application in the analysis of social and information networks. Understanding how statistical ideas are useful for obtaining improved algorithms in these two applications may serve as a model for exploiting complementary algorithmic and statistical perspectives in order to solve applied large-scale scientific and Internet data analysis problems more generally. SPEAKER BIOGRAPHY Dr. Mahoney is currently at Stanford University. 
His research interests focus on theoretical and applied aspects of algorithms for large-scale data problems in scientific and Internet applications. Currently, he is working on geometric network analysis; developing approximate computation and regularization methods for large informatics graphs; and applications to community detection, clustering, and information dynamics in large social and information networks. In the past, he has worked on randomized matrix algorithms and applications in genetics and medical imaging. He has been a faculty member at Yale University and a researcher at Yahoo Research, and his PhD was in computational statistical mechanics at Yale University. See also http://cs.stanford.edu/people/mmahoney/ He is also involved in running the MMDS 2010 meeting on June 15-18, 2010. Details will appear soon at http://mmds.stanford.edu/, which also has details of prior years' Workshop on Algorithms for Modern Massive Data Sets. Michael Mahoney
Views: 2400 San Francisco Bay ACM
Lecture 57 — Extension to Include Global Effects (Advanced) | Stanford University
 
09:43
Graph Mining and Analysis Lecture_10
 
53:23
Graph Mining and Analysis Lecture_10, 21 December 2015
Graph Mining with Deep Learning: challenges and pitfalls - Ana Paula Appel
 
30:30
Deep learning is widely used in many settings, with good fit and accuracy. But when it comes to social networks there are many problems involved: for example, how do we represent a network in a neural network without losing node correspondence? Here I will review the state of the art and present the successes and failures in the area and the perspectives. Ana Paula Appel is a researcher, data scientist and master inventor at IBM Research Brazil. She analyzes large data volumes, especially data mapped in complex networks, and was elected master inventor in 2016 with more than 25 patents registered at the USPTO. A member of the IBM Academy of Technology, Ana Paula holds a master's and a doctorate in Computer Science from ICMC-USP. Access the full content at: https://goo.gl/1HnQez
Views: 84 InfoQ Brasil
Lecture 26 — From AGM to BIGCLAM | Stanford University
 
08:49
Lecture 34 — Spectral Clustering  Three Steps (Advanced) | Stanford University
 
07:18
Lecture 53 — Discussion of the CUR Method | Stanford University
 
07:10
S8E18o: Overlapping communities
 
10:01
Season 8, Episode 18o. Tuesday, 2018-03-29. Overlapping communities. A big issue with structure detection methods that break systems into distinct modules: nodes can easily belong to more than one community (think of people in social networks, for example). We cover an early example that strongly frames and attempts to contend with this issue. Overlapping hBccores is the plan. Complex Networks, Spring 2018. Course website: http://www.uvm.edu/pdodds/teaching/courses/2018-01UVM-303/ Tweetage: https://www.twitter.com/ Tarot cards: scrying-for-the-shapes-of-things The most recent edition of CocoNuTs will (probably) be findable here: http://www.uvm.edu/pdodds/teaching/
UW Allen School Colloquium: Tim Althoff (Stanford University)
 
57:52
Data Science for Human Well-being. Abstract: The popularity of wearable and mobile devices, including smartphones and smartwatches, has generated an explosion of detailed behavioral data. These massive digital traces provide us with an unparalleled opportunity to realize new types of scientific approaches that provide novel insights about our lives, health, and happiness. However, gaining valuable insights from these data requires new computational approaches that turn observational, scientifically "weak" data into strong scientific results and can computationally test domain theories at scale. In this talk, I will describe novel computational methods that leverage digital activity traces at the scale of billions of actions taken by millions of people. These methods combine insights from data mining, social network analysis, and natural language processing to generate actionable insights about our physical and mental well-being. Specifically, I will describe how massive digital activity traces reveal unknown health inequality around the world, and how personalized predictive models can target personalized interventions to combat this inequality. I will demonstrate that modelling how fast we are using search engines enables new types of insights into sleep and cognitive performance. Further, I will describe how natural language processing methods can help improve counseling services for millions of people in crisis. I will conclude the talk by sketching interesting future directions for computational approaches that leverage digital activity traces to better understand and improve human well-being. Bio: Tim Althoff is a Ph.D. candidate in Computer Science in the Infolab at Stanford University, advised by Jure Leskovec. His research advances computational methods to improve human well-being, combining techniques from Data Mining, Social Network Analysis, and Natural Language Processing. Prior to his PhD, Tim obtained M.S. and B.S. 
degrees from Stanford University and University of Kaiserslautern, Germany. He has received several fellowships and awards including the SAP Stanford Graduate Fellowship, Fulbright scholarship, German Academic Exchange Service scholarship, the German National Merit Foundation scholarship, and a Best Paper Award by the International Medical Informatics Association. Tim's research has been covered internationally by news outlets including BBC, CNN, The Economist, The Wall Street Journal, and The New York Times. April 17, 2018 This video is CC.
Deep Learning on Graphs (Neo4j Online Meetup #41)
 
51:45
Knowledge graph generation is outpacing the ability to intelligently use the information that the graphs contain. Octavian's work is pioneering Graph Artificial Intelligence to provide the brains to make knowledge graphs useful. Our neural networks can take questions and knowledge graphs and return answers. Imagine: a Google assistant that reads your own knowledge graph (and actually works); a BI tool that reads your business' knowledge graph; a legal assistant that reads the graph of your case. Taking a neural network approach is important because neural networks deal better with the noise in data and variety in schema. Using neural networks allows people to ask questions of the knowledge graph in their own words, not via code or query languages. Octavian's approach is to develop neural networks that can learn to manipulate graph knowledge into answers. This approach is radically different to using networks to generate graph embeddings. We believe this approach could transform how we interact with databases. Prior knowledge of neural networks is not required, and the talk will include a simple demonstration of how a neural network can use graph data. ABOUT THE SPEAKER: Andy believes that graphs have the potential to provide both a representation of the world and a technical interface that allows us to develop better AI and to turn it rapidly into useful products. Andy combines expertise in machine learning with experience building and operating distributed software systems and an understanding of the scientific process. Before he worked as a software engineer, Andy was a chemist, and he enjoys using the tensor algebra that he learned in quantum chemistry when working on neural networks.
ONLINE DISCUSSIONS: We'll be taking questions live during the session, but if you have any before or after, be sure to post them in the project's thread in the Neo4j Community Site (https://community.neo4j.com/t/online-meetup-deep-learning-with-knowledge-graphs/2963). WANT TO BE FEATURED IN OUR NEXT NEO4J ONLINE MEETUP? We select talks from our Neo4j Community site! https://community.neo4j.com/ To submit your talk, post it in the #projects (if including a link to github or website) or #content (if linking to a blog post, slideshow, video, or article) categories. VOTE FOR THE PRESENTATIONS YOU'D LIKE TO SEE! 'VOTE' for the projects and content you'd like to see! Browse the projects and content categories in our community site and 'heart' the ones you're interested in seeing! community.neo4j.com
Views: 3557 Neo4j
Lecture 25 — The Affiliation Graph Model | Stanford University
 
10:05
Free Inference and Instant Training: Breakthroughs and Implications
 
01:42:29
The fact that many commonly used networks take hours to days for training has motivated recent research towards reducing training time. On the other hand, networks, once trained, are heavyweight dense linear algebra computations, usually requiring expensive acceleration to execute in real time. However, recent advances in algorithms, hardware, and systems have broken through these barriers dramatically. Models that took days to train are now reported to be trainable in under an hour. Further, with model optimization techniques and emerging commodity silicon, these models can be executed on the edge or in the cloud at surprisingly low energy and dollar cost. This session will present the ideas and techniques underlying these breakthroughs and discuss the implications of this new regime of “free inference and instant training.” See more at https://www.microsoft.com/en-us/research/video/free-inference-and-instant-training-breakthroughs-and-implications/
Views: 1868 Microsoft Research