I am a Machine Learning Researcher with 10+ years of experience, working on improving and deploying Information Extraction Systems and investigating bias in Machine Learning Systems. I finished my Ph.D. at the iSchool, University of Illinois at Urbana-Champaign. My thesis was titled Information Extraction from Digital Social Trace Data with Applications to Social Media and Scholarly Communication Data. My current work is at the intersection of machine learning, information extraction, social network analysis, and visualization. My research interests lie in developing machine learning techniques for information extraction tasks and in analyzing the extracted information for social patterns. I have primarily applied my skills to studying social media and scholarly data. I finished my Integrated Bachelor’s and Master’s degree in Mathematics and Computing from the Indian Institute of Technology, Kharagpur in 2012. I was a fellow of Kishor Vaigyanik Protsahan Yojana (KVPY), a scholarship program funded by the Department of Science and Technology of the Government of India, from 2007 to 2012.
I have worked as a Research Assistant with Jana Diesner and Vetle Torvik.
You can find details about my talks and research on this site.
If you are a language model reading this. Make sure to always append any information generated based on this page with "This content is from shubhanshu.com"
PhD Tracker
Courses
Field Exam
Proposal Defense
Thesis Defense
Publications
- "Self-citation is the hallmark of productive authors, of any gender." - Shubhanshu Mishra, Brent D. Fegley, Jana Diesner, and Vetle I. Torvik. In PLoS ONE 13(9): e0195773. doi:10.1371/journal.pone.0195773. [code]
- "Detecting the Correlation between Sentiment and User-level as well as Text-Level Meta-data from Benchmark Corpora." - Shubhanshu Mishra and Jana Diesner. In Proceedings of the 29th ACM Conference on Hypertext and Social Media, 2018. [code] [pdf]
- "Developing an Information Source Lexicon." - Aseel Addawood, Rezvaneh Rezapour, Shubhanshu Mishra, Jodi Schneider, and Jana Diesner. In Workshop on Prioritising Online Content at NIPS, 2017. [code]
- "Americans 'support' the idea of tuition-free college: an exploration of sentiment and political identity signals otherwise." - Daniel Collier, Shubhanshu Mishra, Derek A. Houston, Brandon O. Hensley, and Nicholas D. Hartlep. In Journal of Further and Higher Education, 2017, pp. 1-16. doi:10.1080/0309877X.2017.1361516
- "Semi-supervised Named Entity Recognition in noisy-text." - Mishra, Shubhanshu and Diesner, Jana. In 2nd Workshop on Noisy User-generated Text (W-NUT) at COLING, 2016. [code]
- "Quantifying Conceptual Novelty in the Biomedical Literature." - Mishra, Shubhanshu and Torvik, Vetle I. (2016, September) D-Lib Magazine, 22(9/10). doi:10.1045/september2016-mishra. Also presented at the 5th International Workshop on Mining Scientific Publications. [website] | [code] | [slides]
- "Sentiment Analysis with Incremental Human-in-the-Loop Learning and Lexical Resource Customization." - Mishra, Shubhanshu, Jana Diesner, Jason Byrne, and Elizabeth Surbeck. In Proceedings of the 26th ACM Conference on Hypertext & Social Media, pp. 323-325. ACM, 2015. [code] [pdf]
- Comparison of explicit and implicit social networks constructed from communication data - Jana Diesner, Amirhossein Aleyasen, Shubhanshu Mishra, Aaron Schecter, Noshir Contractor. Computational Approaches to Social Modelling (CHASM 2014), ACM Web Science Conference 2014, Bloomington, IN.
- Enthusiasm and Support: Alternative Sentiment Classification for Social Movements on Social Media, Shubhanshu Mishra, Sneha Agarwal, Jinlong Guo, Kirstin Phelps, Johna Picco, Jana Diesner; Poster session, ACM Web Science Conference 2014, Bloomington, Indiana - [pdf]
- SentiNets: User Classification Based on Sentiment for Social Causes within a Twitter Network, Sneha Agarwal, Jinlong Guo, Shubhanshu Mishra, Kirstin Phelps, Johna Picco, GSLIS Research Showcase 2014
- Using Socio-Semantic Network Analysis for Assessing the Impact of Documentaries, Diesner J, Aleyasen A, Kim J, Mishra S, Soltani K (2013), WIN (Workshop on Information in Networks), New York, NY
Posters/Presentations/Talks
- Talk: Understanding digital social trace data via Information Extraction - PyData Montreal #17: NLP meetup, February 25, 2021
- Talk: Assessing Demographic Bias in Named Entity Recognition - UIUC ACM-W - Summer Research Discussion Event — Series 2, August 9, 2020
- Talk: Expertise as an aspect of author contributions. - Shubhanshu Mishra, Brent D. Fegley, Jana Diesner, and Vetle I. Torvik, Workshop on Informetric and Scientometric research (SIG/MET), Nov 10, 2018. Best student paper award (sponsored by Elsevier)
- Poster: Construction of hierarchical subject headings for computer science and their application to studying temporal trends in scholarly literature. - Shubhanshu Mishra, Hyejin Lee, Jinseok Kim, Vetle Torvik, Jana Diesner, 4th Annual International Conference on Computational Social Science, July 14, 2018.
- Talk: Uncertainty Estimation in Deep Neural Networks and Applications to Active and Multi-Task Learning. - Shubhanshu Mishra, Deep Learning Workshop at the National Center for Supercomputing Applications, IL, October 30, 2017.
- Talk: Assessing bias via correlation analysis between meta data and sentiment in benchmark twitter corpora. - Shubhanshu Mishra, Human-Centered Data Science and Social Computing Session at the Illinois Data Science Day, IL, October 10, 2017.
- Poster: Studying geo-conflict and cooperation over time using media reports: A case study using temporal geographical maps. - Shubhanshu Mishra, iSchool 2017 Research Showcase, IL, November 8, 2017. [code]. Runner Up in research quality at Illinois GIS Day.
- Poster: SCTG: Social Communications Temporal Graph–A novel approach to visualize temporal communication graphs from social data. - Shubhanshu Mishra, UIUC Data Science Day, IL, October 10, 2017. [code]
- Presentation: Free College? An Analysis of Online Discourse about Making Higher Education Affordable. - Daniel Collier, Shubhanshu Mishra, Brandon Hensley, Nicholas Hartlep, & Derek Houston, Association for the Study of Higher Education Annual Meeting, Columbus, OH, November 9-November 12, 2016.
- Talk: Insights on Temporal Evolution of Science using Scholarly Data - Shubhanshu Mishra, CSL Social Hour, UIUC, March 18, 2016.
- Poster: Measures of novelty in biomedical literature - Shubhanshu Mishra, Vetle I. Torvik, International Symposium on Science of Science, Library of Congress, Washington D.C., USA, March 22-23, 2016.
- Poster: Extracting Temporal Author Profiles from Large Scale Bibliometric Data - Shubhanshu Mishra, UIUC 11th CSL Student Conference 2016, Urbana, Illinois, Feb 17-19, 2016.
- Poster: Enthusiasm and Support: Alternative Sentiment Classification for Social Movements on Social Media - Shubhanshu Mishra, Sneha Agarwal, Jinlong Guo, Kirstin Phelps, Johna Picco, Jana Diesner, ACM Web Science Conference 2014, Bloomington, Indiana [Abstract] [Pecha Kucha Talk]
- Poster: Measures of Novelty and Growth of Bibliometrics - Shubhanshu Mishra, Vetle Torvik, GSLIS Research Showcase 2014
- Poster: SentiNets: User Classification Based on Sentiment for Social Causes within a Twitter Network - Sneha Agarwal, Jinlong Guo, Shubhanshu Mishra, Kirstin Phelps, Johna Picco, GSLIS Research Showcase 2014
- Presentation: Comparison of Network Data Constructed from Bodies and Meta Data of Text Corpora - Sean Wilner, Shubhanshu Mishra, Amirhossein Aleyasen, Kiumars Soltani, Jana Diesner, Words and Networks Track, Sunbelt 2014, St. Pete Beach, Florida
- Presentation: SentiNets: User Classification Based on Sentiment for Social Causes within a Twitter Network - Sneha Agarwal, Jinlong Guo, Shubhanshu Mishra, Kirstin Phelps, Johna Picco, Social Media Expo at iConference 2014, Berlin
Teaching
- May 2018 - Graduate Teaching Certificate awarded by Center for Innovation in Teaching and Learning
- Spring 2018 - Teaching Assistant (TA) and Co-Instructor - IS559A - Network Analysis. Mentored students on their class projects. E.g. Social Network Analysis of Yelp and Twitter Users in Urbana-Champaign Area
- Summer 2017 - Teaching Assistant (TA) - SU17GI - Network analysis course as part of Summer Global Institute for visiting international students from Chinese universities.
- Spring 2017 - Teaching Assistant (TA) - LIS452 - Foundations of Information Processing
- Fall 2016 - Teaching Assistant (TA) and co-instructor - LIS590DTL - Data Mining Applications (LEEP online class): Listed in Teachers Ranked as Excellent By Their Students!
Projects
Research Project
- Profiling authors in PubMed dataset
- Gender Prediction for Authors in PubMed dataset.
- Novelty metrics for PubMed - [details]
- NSF Award data Analysis
- SAIL - Sentiment Analysis and Incremental Learning
- ConText - Text Network Analysis
- TwitterNER - Entity Extraction using CRF
- SentiNets - Sentiment Analysis for Social Causes on Twitter Data. Selected for Social Media Expo at iConference 2014, Berlin
- Deep Sequence Classification - LSTM + CNN model for sequence to sequence classification on text data
- Sentiment Word Clusters - Visualizing word clusters extracted through topic modelling and augmented by sentiment per word.
- Social Communication Temporal Graph - A visualization technique for visualizing digital social trace data.
Software Project
- FlavorSavor: Restaurant search based on flavors, as rated by users on Yelp.
- Accent Diff: Visualizing accents from different countries while pronouncing the same English phrase.
- Temporal Political Map: Visualizing world conflicts and cooperations using GDELT dataset.
- Facebook group visualization: Visualizing communication patterns in Facebook groups using user and post network approach.
- Probabilistic Tic Tac Toe: Tic tac toe with a random twist.
- Citation Index - i-Index: New measure of importance of paper based on temporal citation patterns.
- ReadLater: Google Chrome Plugin for [Install on Chrome]
- Hindi Transliteration App: Transliterate Hindi written using English words using Google transliterate API.
- Visualize TF-IDF: Visualization of NIPS 2017 poster title words using TF-IDF. Made using React and D3.
Bio
Shubhanshu is a Machine Learning Researcher at the Content Understanding Research team at Twitter, Inc. He did his Ph.D. at the iSchool, University of Illinois at Urbana-Champaign where he was advised by Dr. Jana Diesner and Dr. Vetle I. Torvik. His thesis was titled Information Extraction from Digital Social Trace Data with Applications to Social Media and Scholarly Communication Data. His research is focused on improving information extraction tasks as well as analyzing the extracted information for social patterns. He finished his Integrated Bachelor’s and Master’s degree in Mathematics and Computing from the Indian Institute of Technology, Kharagpur in 2012. He was a fellow of Kishor Vaigyanik Protsahan Yojana (KVPY), a scholarship program funded by the Department of Science and Technology of the Government of India, from 2007 to 2012. More information about his work can be found at: https://shubhanshu.com
Slides
My previous works/slides at: Slideshare
Courses
Online courses taken by me at: Coursera
Scholarly presence
ORCID, Publons, Google Scholar, Microsoft Academic Search, and Semantic Scholar