Marian-Andrei Rizoiu

Professional Webpage

Marian-Andrei Rizoiu.

I am a Senior Lecturer in Behavioral Data Science with the Data Science Institute at the University of Technology Sydney, where I lead the Behavioral Data Science lab, studying human attention dynamics in the online environment. I am interested in stochastic behavioural modelling of human actions online, at the intersection of applied statistics, artificial intelligence and social data science.

Presenting at the ICTAI 2012 conference in Athens, November 2012

Research

My research has made several key contributions to online popularity prediction, real-time tracking and countering disinformation campaigns, and understanding shortages and mismatches in labour markets.

First, I developed theoretical models for online information diffusion, which can account for complex social phenomena. My models answer questions such as “Why did X become popular, but not Y?” and “How can problematic content be detected based solely on how it spreads?”. Second, I built skill-based real-time occupation transition recommender systems. These systems link social media predicted personality profiles with worker occupation attributes to construct personalised career recommendations.

See more about my research.

News

See the lab's news page for more recent news.

2019-10: Together with Amelia Johns and Francesco Bailo, I was awarded a prestigious Facebook grant on Using computational modelling of user behaviour and machine learning to counter the diffusion of hate speech across social media.

2019-10: I wrote a piece for the influential media outlet The Conversation entitled Can hiding likes make Facebook fairer and rein in fake news? The science says maybe, which received significant attention on social media (Twitter & LinkedIn) and from the FEIT and UTS media departments. Subsequently, I was interviewed on the radio about the work and the phenomenon.

My Publications.

Note: Please head over to the lab's publications page for the list of publications.

Research student theses

Research topics.

My research models human attention dynamics in the online environment. I do this by building tools and methods to detect and predict online information flows. One application is the detection of disinformation campaigns: my method distinguishes between controversial and reputable news sources based solely on how information spreads online, without analysing the content of the news articles. This is significant because it has allowed me to confirm that social systems react to misinformation in detectable ways, which paves the way for building content-free early detection systems.

I also build real-time occupation transition recommender systems. I leverage large datasets of online job ads to quantify the similarity of skills. These systems help to optimise labour markets by measuring the effective deployment of skills, which skills workers can leverage in a new occupation, and which occupations are most suitable for a given set of skills. Furthermore, my research links, for the first time, social media predicted personality profiles with worker occupations to build personalised career recommendation systems. This is significant as it can recommend job transition pathways, identify training gaps, and optimise transition periods, which is especially impactful during labour market disruptions like COVID-19.

A few more details

My current research interest is to theoretically model popularity in online media, as well as to estimate the influence of media content and network characteristics on online attention. We established a generative model that predicts online attention, based on an exogenously driven Hawkes self-exciting process. We also examine the geographical diffusion of media content, with the goal of generating statistical descriptions of content diffusion over time and across geographical areas. We are handling very large Twitter datasets (the network), which relate to YouTube videos (the content).
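As an illustration only (this is a generic textbook form, not the model from the publications), the conditional intensity of a self-exciting process with an external driver can be sketched as follows. The exponential kernel, the parameters alpha and beta, and the function names hawkes_intensity, branching_factor and exo_rate are all assumptions made for this example.

```python
import math


def hawkes_intensity(t, event_times, exo_rate, alpha=0.5, beta=1.0):
    """Conditional intensity at time t of a self-exciting point process.

    exo_rate: function returning the exogenous (external) event rate at time t.
    alpha, beta: hypothetical kernel parameters (jump size and decay rate).
    """
    # Endogenous part: every past event raises the rate by alpha,
    # and that contribution decays exponentially at rate beta.
    endogenous = sum(
        alpha * math.exp(-beta * (t - ti)) for ti in event_times if ti < t
    )
    return exo_rate(t) + endogenous


def branching_factor(alpha, beta):
    """Expected number of events directly triggered by a single event.

    This is the integral of the kernel alpha * exp(-beta * s) over s >= 0.
    """
    return alpha / beta


if __name__ == "__main__":
    past_events = [1.0, 1.5]        # made-up event times
    constant_input = lambda t: 0.2  # made-up constant external rate
    print(hawkes_intensity(2.0, past_events, constant_input))
    print(branching_factor(0.5, 1.0))
```

In this sketch, a branching factor below 1 means that each event triggers fewer than one follow-up event on average, so a cascade dies out unless the external input keeps feeding it.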

My previous work dealt with how partial expert information can be leveraged in an unsupervised learning algorithm that treats complex data. Such data is of different natures (text, image), is temporal and structured, and is linked to a knowledge repository (e.g. an ontology) and/or labeled. Semi-supervised clustering is used to model the additional information (structure, labels, time) and to inject the heterogeneous information into the learning algorithm. A series of applications emerges from the theoretical research: using the temporal dimension to detect temporal patterns and typical evolutions, using image labels to improve the numerical representation of images, and automatic topic evaluation using concept trees.

Sage Research Methods interview

Below is an interview given to Sage Research Methods in which I describe my research. The interview discusses using stochastic computational models to study the popularity of online videos, including a focus on computational social science, current research, data collection and access, identification of relevant data, how data is prepared and processed, determining the computational models to use and applying them, training or organizing the data, adjustments to the research, running the models, next steps for this research, and advice for others interested in this type of research.

Grants & funding, projects and invited talks

Competitive grants and funding - AU$0.6M

Research projects

Seminars and invited talks

Teaching.

Advanced Databases and Data Mining

ANU Undergraduate

This course is for third-year undergraduates in the Research School of Computer Science. It presents relational theory and conceptual modelling; privacy and security; statistical databases; distributed databases; data warehousing; data cleaning and integration; and data mining concepts and techniques. I give the lectures on databases and data warehousing, and on data cleaning and integration.

Document Analysis

ANU Postgraduate

This course is for third-year undergraduates in the Research School of Computer Science, as well as Honours students. It presents techniques related to processing online documents, such as (A) information retrieval, (B) natural language processing, (C) machine learning for documents, and (D) relevant tools for the Web. I give the lectures on the machine learning part and on the social media and sentiment analysis part.

Numerical Machine Learning

European M2 DMKM

This course is for the second year of the Excellence European Master DMKM. It presents advanced machine learning techniques. Together with S. Lallich, we present association rule mining and class rule mining, ensemble methods (bagging, boosting) and statistical testing procedures (cross-validation, Student's t-test, etc.).

I give practical lectures for this course.

Biography.

I am a Lecturer in Computer Science with the Faculty of Engineering and IT in the University of Technology Sydney.

Previously

Between March 2016 and January 2019, I was a Research Fellow, then Lecturer, with the College of Engineering and Computer Science at the Australian National University in Canberra. I was also affiliated with the Data61 unit of CSIRO, in the Decision Sciences team.

Between May 2014 and February 2016, I was a researcher at National ICT Australia in Canberra, Australia, working in the Optimization Research Group. I was also an adjunct lecturer with the College of Engineering and Computer Science at the Australian National University in Canberra.

Between September 2013 and May 2014, I was a postdoctoral researcher with the ERIC Laboratory, financed by the ImagiWeb Research Project. I was also an assistant professor with the DIS Department at the University Lumière in Lyon.

Between 2009 and 2013, I was a PhD student at the ERIC Laboratory with the University Lumière Lyon2, under the supervision of Stéphane Lallich and Julien Velcin. I defended my PhD thesis on June 24th 2013, with honors "Très Honorable".

In July 2009, I obtained my MSc (graduating first in my class, with honors) in Data Mining and Knowledge Management from the Polytechnic School of the University of Nantes, France, and wrote my Master’s Thesis on "Textual Data Clustering and Cluster Naming" after an internship at the ERIC Laboratory.

Between 2004 and 2009, I completed my undergraduate studies, and in September 2009 I obtained my Engineer Diploma (a double diploma, in parallel with the French Master's) in System and Computer Engineering from the Faculty of Automatic Control and Computers of the Polytechnic University of Bucharest, Romania.

Old News

2019-04: My interview with Sage Research Methods just got published, in which I discuss using stochastic computational models to study the popularity of online videos.

2019-03: I am doing a one-month research visit at the French CNRS laboratory Hubert Curien, where I will be working with Christine Largeron on linking the dynamics of online communities and information diffusion processes.

2019-01: I just joined the Faculty of Engineering and IT in the University of Technology Sydney as a Lecturer in Computer Science.

2018-09: We've got the attention of the media! Both Business Insider and the ANU Reporter wrote about our findings concerning bot influence in the 2016 US Elections.

2018-06: Just returned from ICWSM 2018 in Palo Alto, California, where our team presented three papers. I also visited and gave invited talks at Netflix Research and Facebook Core Research. See more details here.

2018-05: After the WWW 2018 conference in Lyon, France, I spent a week visiting the Max Planck Institute for Software Systems in Kaiserslautern, Germany, hosted by the team of Manuel Gomez Rodriguez, one of the top groups worldwide in stochastic modeling.

2018-03: Three of our papers just got accepted at ICWSM 2018, the top computational social science conference. Rendez-vous in Stanford, California, US.

2018-01: Our slick-looking HIPie popularity explorer tool will be showcased at the WWW 2018 conference in Lyon, France. Check out its video tutorial, GitHub repository and live installation.

2017-12: Our paper "SIR-Hawkes: on the Relationship Between Epidemic Models and Hawkes Point Processes" has just been accepted at the WWW 2018 conference in Lyon, France!

2016-05-03: Check out our brand new Computational Media lab page.

2016-03-16: Our work got the attention of Wikimedia Research! I presented "Evolution of Privacy Loss on Wikipedia" in the March session of the Monthly Wikimedia Research Showcase. See recording here.

2015-10: Our paper "Evolution of Privacy Loss on Wikipedia" has just been accepted at the WSDM 2016 conference in San Francisco!

2015-09: Our paper "ClusPath: A Temporal-driven Clustering to Infer Typical Evolution Paths" has been accepted for publication in the journal Data Mining and Knowledge Discovery.

This year 2015, I give lectures about Machine Learning and Social Media and Sentiment Analysis in the context of the Document Analysis course, with the Research School of Computer Science at the ANU.

Since May 2014, I have been at the NICTA lab, Canberra, Australia, as a researcher, working with Manuel Cebrian, Lexing Xie and Pascal Van Hentenryck.

Starting from May 2014, I will be at the NICTA lab, Canberra, Australia, in a research visit and I will be working with Manuel Cebrian and Pascal Van Hentenryck.

On the 24th of June, I defended my PhD thesis, entitled "Semi-Supervised Structuring of Complex Data", and I obtained the title of PhD from the University Lumière Lyon 2. The thesis manuscript is available here.

In July, I will be at the NICTA lab, Australia, in a research visit and I will be working with Tiberio Caetano.

In August, I will participate in the Doctoral Consortium of the International Joint Conference on Artificial Intelligence. My submission, entitled "Semi-Supervised Structuring of Complex Data", has been accepted for publication in the Proceedings of IJCAI'13.

The paper "Unsupervised Feature Construction for Improving Data Representation and Semantics", written with J. Velcin and S. Lallich, has been accepted for publication in JIIS.

I was part of the organizational committee and head of the volunteers team for the joint organization of the conferences ALT'12 and DS'12.

I gave a lecture on Using a Pareto Front for a Non-Supervised Feature Construction Algorithm.
