I am a Ph.D. candidate at the University of Alberta, working with Professor Rich Sutton in the Reinforcement Learning and Artificial Intelligence (RLAI) lab. I am fascinated by the question of how the mind works and how we can build artificial systems that have general learning abilities. I believe reinforcement learning offers the computational theory of intelligence and I am trying to discover how agents can continually learn to perform a myriad of tasks over their lifetime!
You can find my resume here (last updated: May 2023).
- (Apr 2023) Gave an oral presentation of my internship work on RL in recommender systems at a WWW ’23 workshop in Austin
- (Jun 2022) Started an internship at Google Brain
- (Apr 2022) Passed my PhD candidacy exam!
- (Mar 2022) Paper accepted at RLDM 2022
- (Dec 2021) Presented a lecture on ‘The Essentials of RL’ at the 3rd Nepal AI Winter School
- (Sep 2021) Paper accepted at NeurIPS 2021
- (Jul 2021) Co-hosted an ICML Social on Continuing Problems in RL
- (May 2021) Paper accepted at ICML 2021
- (May 2021) Presented two posters at NERL 2021 (one submitted, one invited)
- (Apr 2021) Paper accepted in the Journal of AI Research (JAIR)
- (Jan 2021) Started TA-ing for Rich Sutton’s CMPUT609 RL-2 course
- (Dec 2020) Helped organize the Policy Optimization in RL tutorial at NeurIPS 2020. We made some cool interactive notebooks; links on the website!
- (Oct 2020) Presented our work on ‘Personalized Brain State Targeting via Reinforcement Learning’ at the 3rd Neuromatch conference (more Q/A at the 9:58:41 mark)
Publications and Pre-prints
I mainly focus on learning and planning methods for continual learning in RL. In particular, I design algorithms for non-episodic problems so that an agent can learn to achieve its goals from a single stream of experience (without resets or timeouts).
Investigating Action-space Generalization in RL for Recommender Systems [PDF]
Abhishek Naik, Bo Chang, Alexandros Karatzoglou, Martin Mladenov, Ed H. Chi, Minmin Chen
Oral presentation at the Decision Making for RecSys workshop, WWW, 2023.
Multi-Step Average-Reward Prediction via Differential TD(λ) [PDF]
Abhishek Naik, Richard S. Sutton
In The Conference on Reinforcement Learning and Decision Making (RLDM), 2022.
Average-Reward Learning and Planning with Options [PDF]
Yi Wan, Abhishek Naik, Richard S. Sutton
In Advances in Neural Information Processing Systems (NeurIPS), 2021.
Towards Reinforcement Learning in the Continuing Setting [PDF]
Abhishek Naik, Zaheer Abbas, Adam White, Richard S. Sutton
In Never-Ending Reinforcement Learning (NERL) Workshop, ICLR 2021.
Learning and Planning in Average-Reward Markov Decision Processes [PDF]
Yi Wan*, Abhishek Naik*, Richard S. Sutton
In International Conference on Machine Learning (ICML), 2021.
Discounted Reinforcement Learning is Not an Optimization Problem [PDF]
Abhishek Naik, Roshan Shariff, Niko Yasui, Richard S. Sutton
In Optimization Foundations of Reinforcement Learning Workshop, NeurIPS 2019.
MADRaS: Multi Agent DRiving Simulator [PDF]
Anirban Santara, Sohan Rudra, Sree Aditya Buridi, Meha Kaushik, Abhishek Naik, Bharat Kaul, Balaraman Ravindran
In Journal of Artificial Intelligence Research (JAIR), 2021.
RAIL: Risk-Averse Imitation Learning [PDF]
Anirban Santara*, Abhishek Naik*, Balaraman Ravindran, Dipankar Das, Dheevatsa Mudigere, Sasikanth Avancha, Bharat Kaul
In International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), 2018.
Identifying User Survival Types via Clustering of Censored Social Network Data [PDF]
S Chandra Mouli, Abhishek Naik, Bruno Ribeiro, Jennifer Neville
Technical Report, arXiv:1703.03401, 2017.
Essentials of Reinforcement Learning
3rd Nepal Winter School in AI, Dec 2021
Towards Reinforcement Learning in the Continuing Setting
NERL workshop at ICLR 2021, May 2021
Personalized Brain State Targeting via Reinforcement Learning
The 3rd Neuromatch Conference, Oct 2020
[Video (more Q/A here), Slides]
Learning and Planning in Average-Reward MDPs
Tea Time Talks, RLAI lab and Amii, Aug 2020
On Intelligence: A Glimpse of the Diversity in Natural Intelligence
Figuring Out How the Mind Works: At the Exciting Intersection of RL, Psychology, and Neuroscience
Cognitive Psychology Seminar, Dept. of Psychology, University of Alberta, March 2020
Discounting — Does It Make Sense?
Tea Time Talks, RLAI lab and Amii, Aug 2019
This thesis was a part of my integrated Bachelor’s + Master’s program in the Dept. of Computer Science and Engineering at the Indian Institute of Technology Madras in Chennai, India, supervised by Professor Balaraman Ravindran. Defended in May 2018.
My goal was to help make self-driving cars a reality in my country, India. Towards this end, I modeled autonomous driving as a multi-agent learning problem in a safety-critical application and:
- proposed a risk-averse imitation learning algorithm with lower tail-end risk than the then state-of-the-art,
- trialled a curriculum-based learning approach for multi-agent RoboSoccer, and
- extended the TORCS simulator to release the first open-source driving simulator that supports multi-agent training — MADRaS (has 100+ stars on GitHub).
Google Research, Brain Team
Research Scientist Intern; June 2022 – Sep 2022; Toronto, Canada
With Bo Chang and Alexandros Karatzoglou.
- Investigated methods for action-space generalization in RL for large-scale recommender systems like YouTube.
Research Internship; May 2019 – Sep 2019; Edmonton, Canada
With Hengshuai Yao.
- Worked on establishing an appropriate problem formulation for control in continuing tasks with function approximation.
- Surveyed the literature on the average reward problem formulation for MDPs, and its connection with reinforcement learning.
- Some of the work started here was presented at the NeurIPS 2019 Workshop on Optimization Foundations of Reinforcement Learning (OPTRL 2019).
Research Internship; May 2017 – Jul 2017; Bengaluru, India
With Bharat Kaul
- Started work on a multi-agent version of the TORCS driving simulator (MADRaS) compatible with OpenAI Gym.
- Proposed and implemented a novel risk-averse imitation learning framework, achieving up to 89% improvement over the state-of-the-art in tail-end risk on several physics-based control tasks.
- This project was presented at AAMAS 2018.
Research Internship; May 2016 – Jul 2016; Indiana, USA
With Bruno Ribeiro
- Engineered temporal features for a binary probabilistic classifier that categorizes the expected lifespan of new users based on their initial activity.
- Created and curated one of the richest social-media datasets and released it for public use via a technical paper.
Amazon Development Centre
Technical Internship; May 2015 – Jul 2015; Chennai, India
With Sravan Bodapati and Venkatraman Kalyanapasupathy
- Built a classifier to determine the start-reading-location of books.
- Now in production, this feature helps Kindle users start reading a book sooner after downloading it, without having to flip through pages like acknowledgements or copyright notices. If you use a Kindle, you have likely experienced this!
Reinforcement Learning II (CMPUT609) (x3)
Jan - Apr 2023, 2021, 2020; Dept. of Computing Science, University of Alberta
As a teaching assistant for Professor Rich Sutton’s class of ~30 graduate students, I presented some lectures, helped create the assignments, and spent a surprisingly large amount of time grading.
Reinforcement Learning I (CMPUT397)
Sep 2020 - Dec 2020; Dept. of Computing Science, University of Alberta
Helped Professor Martha White teach a class of ~150 undergraduate students.
Reinforcement Learning (CS6700)
Jan 2018 - May 2018; Dept. of CSE, IIT Madras
As the Head Teaching Assistant of this course offered by Professor Balaraman Ravindran, I created and evaluated tutorials, programming assignments, and exams for a class of about 90 undergraduates and graduates.
Principles of Machine Learning (CS4011)
Aug 2017 - Nov 2017; Dept. of CSE, IIT Madras
As one of the teaching assistants for this course, offered by Professor Balaraman Ravindran and Professor Mitesh Khapra, I created and evaluated tutorials, programming assignments, and quizzes for a class of about 90 undergraduates.
Reinforcement Learning Specialization on Coursera [Link]
Jan 2019 - Oct 2019; University of Alberta
As one of the ‘Subject Matter Experts’, I developed programming assignments, multiple-choice quizzes, and slides for the four courses that form the RL Specialization, released in late 2019. There have been more than 10k enrollments so far!
Co-organizer, ICML 2021 Social on Continuing (Non-episodic) Problems in RL
Had insightful discussions with many attendees about the state of research in continuing problems and where the field should go from here.
Co-organizer, NeurIPS 2020 Tutorial on Policy Optimization in RL
Alan Chan, Shivam Garg, Dhawal Gupta and I created a set of notebooks to highlight some aspects of policy-gradient methods, such as the effects of a baseline. Thanks to Sham Kakade, Martha White, and Nicolas Le Roux for giving us the opportunity!
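One point the notebooks covered was the effect of a baseline. A minimal sketch of the idea (my own toy example, not taken from the actual notebooks): on a two-armed bandit with a softmax policy, subtracting a baseline from the reward leaves the REINFORCE gradient estimate unbiased while shrinking its variance.

```python
# Toy demonstration that a baseline keeps the REINFORCE gradient estimate
# unbiased while reducing its variance. Two-armed bandit, softmax policy
# with a single preference parameter THETA for arm 0.
import math
import random
import statistics

random.seed(0)

THETA = 0.3                              # preference for arm 0
P0 = 1.0 / (1.0 + math.exp(-THETA))      # pi(arm 0) under the softmax/sigmoid
REWARDS = (10.0, 12.0)                   # deterministic rewards, both far from zero

def grad_sample(baseline):
    """One REINFORCE gradient sample: d/dtheta log pi(a) * (r - baseline)."""
    a = 0 if random.random() < P0 else 1
    r = REWARDS[a]
    score = (1.0 - P0) if a == 0 else -P0   # d/dtheta log pi(a)
    return score * (r - baseline)

samples_no_b = [grad_sample(0.0) for _ in range(100_000)]
samples_b = [grad_sample(11.0) for _ in range(100_000)]   # baseline near mean reward

# The two means agree (the baseline introduces no bias),
# but the variance with the baseline is orders of magnitude smaller.
print(statistics.mean(samples_no_b), statistics.variance(samples_no_b))
print(statistics.mean(samples_b), statistics.variance(samples_b))
```

Analytically, the expected gradient is P0·(1−P0)·(r0−r1) regardless of the baseline, since the score function has zero mean; only the variance changes.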
Organizer, Tea Time Talks 2020, Amii and RLAI lab
June 2020 – Aug 2020
Organized and moderated the talks of 40+ speakers over the course of 12 weeks (in a virtual format for the first time). Full playlist here.
Executive Member, Computer Science Graduate Students’ Association, University of Alberta
Apr 2019 – Apr 2020
Along with representing the interests of the graduate students to the department, I helped organize activities which support their well-being – physically and emotionally, academically and personally – to make the University of Alberta a home away from home, especially for international students.
Volunteer, Centre for Autism Services Alberta
Jan 2019 - Mar 2020
As part of the Centre’s Community and Therapeutic program, I helped organize recreational activities for individuals aged 5–20 on the autism spectrum. The aim was to create a fun and supportive atmosphere for them to interact with each other and have a good time.
Interests and Hobbies
One of the fastest sports in the world, with an exhausting 60 minutes of action (yes, even while watching). The wizardry these athletes pull off while on skates is a delight to watch (shoutout to Connor McDavid! #LetsGoOilers). I went from learning to ice-skate, to learning to play ice-hockey, to now playing hockey in a league!
There’s hardly anything as spectacular as this confluence of science and engineering, which gives the world these lean, mean, and beautiful machines, with some of the fittest athletes on the planet battling fearlessly at speeds in excess of 300 km/h on 20+ challenging tracks all over the world. Current favorite track: Spa-Francorchamps. Team: Forza Ferrari forever!
If I had to pick one of the few things I could do all my life, it would be reading (sports comes first). With three fat bookshelves overflowing with books back home, and many more on my handy Kindle, there are actually times when I am happy to see long queues, as they present another opportunity to dive into my latest book. Some of my favorite authors are Adrian Tchaikovsky, Ted Chiang, Andy Weir, and Michael Crichton. I also read non-fiction, mostly about intelligence.
I’ve found space fascinating since I was a kid. Over the past few years, my go-to sci-fi subgenre has been first contact and intergalactic travel. My interest in space has had a massive resurgence lately thanks to Kerbal Space Program and Everyday Astronaut. Instead of core AI, I might start a career in Space x AI…
Photography and Traveling
I love visiting and documenting quaint, spectacular places, and digging into the local cuisine. Till I figure out where to showcase some of my favorite pictures, here is my old Flickr account. I also enjoy trekking and hiking into the wilderness. After skydiving, bungee jumping, scuba diving, and parasailing, I’m looking forward to hang gliding and cliff jumping!
Linear Algebra yields a hilariously fast method to compute Fibonacci numbers!
May 12, 2023
A post about some books that I read recently
August 28, 2021
Examples in nature of organisms that can live forever, and whether we humans really aspire to that
July 17, 2021
Derivation of discounted policy gradient
July 10, 2021
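The first post above, on Fibonacci numbers via linear algebra, refers to a classic trick; a minimal sketch (my own, not necessarily the post’s exact code): since [[1, 1], [1, 0]]^n contains F(n), exponentiation by squaring computes F(n) in O(log n) matrix multiplications.

```python
# Fast Fibonacci via 2x2 matrix exponentiation:
# [[1, 1], [1, 0]]^n = [[F(n+1), F(n)], [F(n), F(n-1)]],
# so squaring the matrix repeatedly gives F(n) in O(log n) multiplications.

def mat_mult(a, b):
    """Multiply two 2x2 matrices of Python ints."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]

def fib(n):
    """Return F(n), with F(0) = 0 and F(1) = 1, by exponentiation by squaring."""
    result = [[1, 0], [0, 1]]            # identity matrix
    base = [[1, 1], [1, 0]]
    while n > 0:
        if n & 1:
            result = mat_mult(result, base)
        base = mat_mult(base, base)
        n >>= 1
    return result[0][1]                  # F(n) sits in the off-diagonal

print(fib(10))   # 55
print(fib(50))   # 12586269025
```

Because Python ints are arbitrary precision, the same code handles huge n, where the naive linear recurrence would be far slower.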