Vertical: V6 - Data Sciences

Notice Board -

Indian researchers, scientists, academicians and Ph.D. students may send an email to vaibhav_ds_appl@iitbhu.ac.in, vaibhav_project_management@iitg.ac.in, vaibhav_education@iitg.ac.in or vaibhav_ds_security@cmmacs.ernet.in to attend session(s) of the V6 - Data Sciences vertical, quoting the session ID given in the Session Schedule. After approval, you will receive a link to the session by email.

   Horizontals

V6H1 : Data Science Project Management
  • Data science project life-cycles in theory and practice
  • Requirements in collection, integration, preparation, cleaning, structuring
  • Case Studies – India & Abroad; similarities & differences
  • Key challenges in data science project management – Indian context & Abroad; similarities & differences
  • Key challenges in government while implementing data science projects
  • Challenges in maintaining large enterprises
  • Pain-points and open challenges (to be documented in report)
  • Way forward: Solution proposals (to be documented in report)

V6H2 : Vision of Data Sciences

In this session, we look forward to expert insight on the overall promise of data science, the range of application areas it can be expected to impact, how its impact in those areas can be measured, a vision of the future over short-term (5-year) and long-term (20-year) time frames, and the current impediments to realizing that vision.

India is in a unique position when it comes to leveraging data science and analytics. Indian enterprises have certainly witnessed their fair share of innovation in data science, particularly in sectors such as banking, manufacturing, retail and healthcare. It is exciting to glimpse into the future of this interdisciplinary field of data science, its prospects, road-blocks and dangers.

Topics:

  • Range of Application areas
  • Metrics of impact
  • Vision of the future (5 years, 20 years)
  • Current impediments

V6H3 : Data Privacy and Security

While data science is a critical enabler of efficiency and effectiveness in modern organizations, it often ends up entrusting sensitive data to external agencies. This brings challenges in terms of both security and privacy. Security relates to the protection of personal information from threats, while privacy addresses how personal data is collected, managed, stored and shared. The threat landscape has changed rapidly over the last few years with the emergence of more and more cyber-physical systems. Data breaches are increasingly common, affecting individuals as well as the world’s biggest companies. The current pandemic has increased these challenges manifold.

 

The panel will discuss specific issues of data protection, in terms of both privacy and security, at each stage in the data life-cycle (collection, integration, preparation, cleaning, structuring, verification, annotation, task allotment, crowd-sourcing). The panel will discuss the data protection frameworks practiced elsewhere in the world, to serve as a point of reference for developing similar legislation for India. The panel will also throw light on common concerns about personal data collected by the Government, by mobile applications, etc., and on the potential consequences of misuse of data, along with the associated legal compliance and data protection requirements.

 

Potential Outcomes:

Recommendations and Open Challenges: Suggestions for an India-centric data framework with a focus on security and privacy

Way forward: Discuss some of the use cases of societal and strategic importance as potential areas of collaboration with short-term and long-term deliverables.


V6H4 : Data Science and Education  
  • Skill requirement in top-level industry / academics
  • High-skill curriculum / topics (to include in 1st tier engg. colleges)
  • Recommendations (to be documented in report)
  • Medium-skill curriculum / topics (can include in 2nd/3rd tier engg. colleges)
  • Curriculum / topics for all (can include in schools)
  • Recommendations (to be documented in report)

Possible topics / skills to include in session

Best practices in choosing data science stacks and their benefits / drawbacks. Must include open source options. Options for general programming (e.g. Python & R), data manipulation (e.g. Pandas), ML (Scikit-learn), querying (SQL/NoSQL), data visualization and presentation (e.g. Tableau), deep-learning (TensorFlow (most popular) or PyTorch (growing fastest))


V6H5 : Data Science Applications  

Session 1: Image, Video, and Textual Data Analytics: Current and Future Trends (3 hours)

Advances in data science tools and techniques have motivated researchers all over the world to take on increasingly complex real-life applications that exploit multimedia data (i.e., image, video, and text data). The effectiveness of data science has already been demonstrated in a wide range of applications, including image, video and text analytics. The session will discuss some of the important application areas and identify possible future research directions in these domains. Important applications of image and video analytics include using ML/DL techniques to extract features from images and videos captured by surveillance cameras, so that useful scene-related information can be obtained automatically for crowd activity understanding, behavioral anomaly detection, etc., enabling effective crowd management. With the availability of low-cost cameras and sophisticated video technology, large amounts of image and video data are captured and shared over the internet. Often these images and videos disclose the identity of an individual without his/her consent. Face de-identification techniques can be used to obfuscate the identity of individuals while preserving their other non-biometric traits. The usefulness of face de-identification in the Indian context, and possible solutions to the problem, are important points of discussion for this session. Other topics of interest include biometric recognition/re-identification using spatio-temporal features extracted from videos, summarization, 2D/3D pose estimation and pose transfer, and image and video analytics in autonomous vehicles for collision detection, avoidance, and path planning.
Recent advances in generative models will be discussed, with emphasis on Generative Adversarial Networks (GANs), since these networks have been widely used in the recent past to develop models for a variety of image translation tasks such as image de-fencing and de-noising. In the domain of text analytics, the discussion will focus on computer-aided learning for translation from one Indian language to another, translation from an Indian language to English, information retrieval and extraction, and recommender systems based on collaborative filtering, content-based filtering and hybrid models. Panellists will discuss how data science can be used to develop a platform that makes learning Indian languages easier, since such a platform could partly solve the contentious language problem in India.
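As a small illustration of the collaborative filtering approach to recommender systems mentioned in the Session 1 description, the sketch below predicts a learner's rating for an unseen item from the ratings of similar users. It is a minimal user-based variant under stated assumptions: the user names, item names and ratings are hypothetical, cosine similarity is one of several possible similarity measures, and production systems typically use more robust models.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two rating dicts, computed over items both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no comparable user has rated the item.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[item]
        weight_total += sim
    return weighted_sum / weight_total if weight_total > 0 else None


# Hypothetical ratings (1-5) of language-learning courses.
ratings = {
    "asha": {"course_hi": 5, "course_ta": 1},
    "ravi": {"course_hi": 5, "course_ta": 1, "course_bn": 4},
    "meena": {"course_hi": 1, "course_ta": 5, "course_bn": 2},
}

predicted = predict_rating(ratings, "asha", "course_bn")
print(predicted)
```

Because asha's known ratings match ravi's far more closely than meena's, the prediction is pulled towards ravi's rating of 4 rather than meena's rating of 2.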

 

Session 2: Data Mining and Application of Data Science in Smart Cities, Health Care Management and Agriculture (3 hours)

In today’s world, where we are surrounded by data, interpreting that data has become one of the important tasks for researchers. Interpretation becomes easier if one can find patterns in the data, and finding patterns is one of the major sub-tasks in data mining. Pattern mining applied to high-dimensional data, where frequently occurring patterns are searched for, is known as Frequent Itemset Mining (FIM), one of the fundamental research topics in data mining. High-utility itemset (HUI) mining is a subfield of FIM; compared to FIM, utility mining provides more informative and actionable results. Different data mining and pattern mining techniques can be applied to data stored or represented in different structures, and big graphs or networks are one such structure. With their increasing popularity and importance, graphs and networks such as social networks have attracted the attention of many researchers and companies. Graph generation is an important graph mining task for understanding the properties of a graph or network. Networks such as social networks can be used to spread an idea, product, etc. among people or communities; this objective can be pursued through influence maximization techniques or viral marketing. Connecting people with others and recommending products to customers can be done through graph mining techniques such as link prediction and community detection. This session will focus on techniques for pattern mining, high-utility itemset mining, and graph mining, including graph generation, community detection, link prediction, influence maximization, viral marketing, and graph-concept generation.
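To make the contrast between frequent itemset mining and high-utility itemset mining concrete, the following sketch counts itemset support by brute-force enumeration and then computes the utility (here, total profit) of two itemsets. It is an illustration only: the transactions, item names and unit profits are hypothetical, and the exhaustive enumeration stands in for practical algorithms such as Apriori or FP-Growth, which prune the search space.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions.

    Brute force: enumerates every subset of every transaction, so it only
    suits tiny examples.
    """
    counts = {}
    for basket in transactions:
        items = sorted(set(basket))
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] = counts.get(combo, 0) + 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def itemset_utility(itemset, transactions, unit_profit):
    """Total profit of `itemset` over the transactions containing all of its items."""
    total = 0
    for basket in transactions:  # each basket maps item -> quantity bought
        if all(item in basket for item in itemset):
            total += sum(basket[item] * unit_profit[item] for item in itemset)
    return total


# Hypothetical purchase data: item -> quantity per transaction.
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "milk": 1, "saffron": 1},
    {"bread": 3},
    {"milk": 2, "saffron": 1},
]
unit_profit = {"bread": 1, "milk": 2, "saffron": 50}

frequent = frequent_itemsets(transactions, min_support=2)
bread_milk = itemset_utility(("bread", "milk"), transactions, unit_profit)
milk_saffron = itemset_utility(("milk", "saffron"), transactions, unit_profit)
print(frequent[("bread", "milk")], frequent[("milk", "saffron")])
print(bread_milk, milk_saffron)
```

Both pairs appear in the same number of transactions, so support alone cannot separate them, whereas their utilities differ widely. This is the sense in which utility mining can be more actionable than frequency alone.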
In addition, panellists will discuss how data science can be applied in building smart cities, including developing and monitoring sustainable urban infrastructure and buildings, improving the power supply through smart grid technology, detecting and counteracting problems with aging urban infrastructure, and deploying ubiquitous sensing devices to facilitate everyday activities in a crowded urban environment, and how it can assist agriculture, which is the backbone of the Indian economy. Points of discussion will include cyber-physical infrastructure for smart cities, green computing, bio-inspired control using data from animals, computational biology, computational imaging, cyber-security, medical informatics, video analytics, etc. Additional points of discussion may include the benefits of data science in agriculture through satellite data-based soil and crop monitoring, weather prediction, fertilizer recommendation, spectroscopy data analysis to assess soil quality, production forecasting, and related issues. The session will also focus on the application of data science in health care. About 1.2 billion clinical documents/images are produced in the United States every year. It is often difficult for doctors and healthcare professionals to retrieve the relevant information from this pool of medical data, which brings out the need for better, more informed healthcare. Unstructured medical data can be processed to gain a deeper understanding of the latest medicines, disease detection techniques, and other useful diagnostic information. At present, AI/ML tools are used extensively by data scientists and machine learning experts all over the world to achieve this goal.
Data Science has immense potential to revolutionize healthcare in the coming years through disease prediction and management, as well as through design and development of automated software tools for management of diseases from text, image and video data.


Session Schedule

Horizontal | Session ID | Name | University/Organisation | Country

Data Science Project Management | V6H1S1 | Data Science Project Management | 3/10/2020 | 8-11 pm (IST)

Prof. Abhishek Chandra | University of Minnesota, Twin Cities | United States of America
Prof. Swadhin K. Behera | JAMSTEC | Japan
Prof. Jaideep Srivastava | University of Minnesota | United States of America
Dr. Bibhas Chakraborty | National University of Singapore | Singapore
Dr. Hari Koduvely | Senior Data Scientist at Micro Focus | Canada
Dr. Vijay Mago | Lakehead University | Canada
Dr. Om Patri | Cloud Data Architect at Amazon | Canada
Dr. Chirag Shah | University of Washington | United States of America
Mr. Shreenidhi Bharadwaj | Northwestern University | United States of America
Prof. Ponnurangam Kumaraguru | IIIT Delhi | India
Prof. Ramesh Loganathan | IIIT Hyderabad | India
Dr. Sanasam Ranbir Singh | IIT Guwahati | India
Prof. D. Janakiram | IIT Madras | India
Prof. Santosh Biswas | IIT Bhilai | India
Dr. Vijaya Saradhi | IIT Guwahati | India
Prof. Ratnajit Bhattacharjee | IIT Guwahati | India

Vision of Data Sciences | V6H2S1 | Vision of Data Sciences | 8/10/2020 | 7-10 pm (IST)

Prof. Jaideep Srivastava | University of Minnesota | United States of America
Prof. Barath Narayanan | University of Dayton | United States of America
Prof. Srinivas Aluru | Georgia Institute of Technology | United States of America
Prof. Chandan K. Reddy | Virginia Tech | United States of America
Prof. Amit Sheth | University of South Carolina | United States of America
Prof. Anirban Dasgupta | IIT Gandhinagar | India
Prof. Vikram Pudi | IIIT Hyderabad | India
Dr. Shailesh Kumar | Jio Digital Life | India
Prof. P. J. Narayanan | IIIT Hyderabad | India
Dr. Vijaya Saradhi | IIT Guwahati | India
Prof. Jayant Haritsa | IISc, Bangalore | India

Data Privacy and Security | V6H3S1 | Data Privacy and Security | 18/10/2020 | 6-9 pm (IST)

Prof. Shishir Nagaraja | University of Strathclyde, Glasgow | United Kingdom
Prof. Sourav Sen Gupta | Nanyang Technological University | Singapore
Prof. Vijay Atluri | Rutgers | United States of America
Prof. Jaideep Srivastava | University of Minnesota | United States of America
Prof. Bhavani Thuraisingham | University of Texas at Dallas | United States of America
Prof. Sujoy Sinha Roy | University of Birmingham, Edgbaston | United Kingdom
Dr. Prem Mohan Shukla | Zscaler Inc, California | United States of America
Prof. Saraju Mohanty | University of North Texas | United States of America
Prof. Arindam Banerjee | University of Minnesota, Twin Cities | United States of America
Prof. Vijay Varadharajan | University of Newcastle | Australia
Prof. Ponnurangam Kumaraguru | IIIT Delhi | India
Dr. Vinayak Godse | Data Security Council of India | India
Prof. Somnath Tripathy | IIT Patna | India
Dr. Sushmita Ruj | ISI, Kolkata | India
Dr. Mayank Swarnkar | IIT BHU | India
Dr. Vidyadhar Mudkavi | CSIR-4PI | India
Dr. Pronab Mohanty | UIDAI | India
Dr. Rajesh Pillai | SAG, DRDO | India
Dr. SK Pal | SAG, DRDO | India
Dr. G.K. Patra | CSIR-4PI | India

Data Science Education | V6H4S1 | Data Science Education | 21/10/2020 | 7-10 pm (IST)

Prof. Swadhin K. Behera | JAMSTEC | Japan
Prof. Devavrat Shah | Massachusetts Institute of Technology | United States of America
Prof. Biplav Srivastava | University of South Carolina | United States of America
Prof. Srinivas Aluru | Georgia Institute of Technology | United States of America
Dr. Bibhas Chakraborty | National University of Singapore | Singapore
Prof. Padmanabhan Seshaiyer | George Mason University | United States of America
Dr. Anshul Vikram Pandey | Accern | United States of America
Prof. Ratnajit Bhattacharjee | IIT Guwahati | India
Prof. Samar Agnihotri | IIT Mandi | India
Prof. Madhavan Mukund | CMI, Chennai | India
Prof. Sashikumaar Ganesan | IISc, Bangalore | India
Prof. Sumohana S Channappayya | IIT Hyderabad | India
Prof. Hemangee Kalpesh Kapoor | IIT Guwahati | India
Prof. Samerandra Dandapat | IIT Guwahati | India
Prof. S. R. Mahadeva Prasanna | IIT Dharwad | India
Prof. Rajeev Srivastava | IIT BHU | India
Dr. Anil Kumar Singh | IIT BHU | India
Dr. Pratik Chattopadhyay | IIT BHU | India
Prof. Puneet Bindlish | IIT BHU | India
Dr. Vijaya Saradhi | IIT Guwahati | India
Prof. Rohit Sinha | IIT Guwahati | India

Data Science Applications | V6H5S1 | Image, Video, and Textual Data Analytics: Current and Future Trends | 13/10/2020 | 8-11 pm (IST)

Prof. Sandeep Gupta | Arizona State University | United States of America
Prof. Biplav Srivastava | University of South Carolina | United States of America
Prof. Prasenjit Mitra | Penn State College of Information Sciences and Technology | United States of America
Prof. Anuj Srivastava | Florida State University | United States of America
Prof. Barath Narayanan | University of Dayton | United States of America
Prof. Pradeep Atrey | SUNY, Albany | United States of America
Prof. Rajeev Srivastava | IIT (BHU), Varanasi | India
Dr. Pratik Chattopadhyay | IIT (BHU), Varanasi | India
Dr. Anil Kumar Singh | IIT (BHU), Varanasi | India
Dr. Sukomal Pal | IIT (BHU), Varanasi | India
Dr. Sukhada | IIT (BHU), Varanasi | India
Prof. R. Balasubramanian | IIT Roorkee | India
Dr. Sanjeev Kumar | IIT Roorkee | India
Dr. Partha Pratim Roy | IIT Roorkee | India
Dr. Ramesh Anbanandham | IIT Roorkee | India
Prof. C V Jawahar | IIIT Hyderabad | India

Data Science Applications | V6H5S2 | Data Mining and Application of Data Science in Smart Cities, Health Care Management and Agriculture | 16/10/2020 | 8-11 pm (IST)

Prof. Saraju P Mohanty | University of North Texas | United States of America
Prof. Barath Narayanan | University of Dayton | United States of America
Prof. Nitesh Chawla | University of Notre Dame | United States of America
Prof. Srinivas Aluru | Georgia Institute of Technology | United States of America
Prof. Rajeev Srivastava | IIT (BHU), Varanasi | India
Dr. Pratik Chattopadhyay | IIT (BHU), Varanasi | India
Dr. Bhaskar Biswas | IIT (BHU), Varanasi | India
Dr. Ajay Pratap | IIT (BHU), Varanasi | India
Dr. Prasenjit Chanak | IIT (BHU), Varanasi | India
Dr. R. Balasubramanian | IIT Roorkee | India
Dr. Partha Pratim Roy | IIT Roorkee | India
Dr. Ramesh Anbanandham | IIT Roorkee | India
Prof. Kamal Karlapalem | IIIT Hyderabad | India