
Prashant Chopra

902.452.4242 · pchopra@hotmail.com

Canadian Citizen · Data Engineer · AWS Certified Cloud Practitioner

Project- and results-oriented IT professional with more than 13 years of experience in the analysis, development, and deployment of data-centric solutions for industries including retail, banking, telecom, insurance, energy, and government, involving large-scale data warehouses and real-time analytics.

Skills
·         Big Data/Hadoop: HDFS, YARN, MapReduce, Hive, Spark, Impala, Sqoop, Flume, Cloudera, Hortonworks HDP, Ambari, Ranger (Familiarity: Kafka)

·         Programming: Python, SQL, PL/SQL, HiveQL

·         Schedulers: Airflow, Autosys, Tidal

·         Methodology: SDLC, Waterfall, Agile, JIRA

·         SQL/NoSQL: Hive, Spark SQL, Oracle, ORC, Parquet (Familiarity: Cassandra, Redshift)

·         Cloud: MS Azure, AWS, GCP (Familiarity: Databricks, Redshift, BigQuery)

·         Tools: PyCharm, VS Code, Zeppelin, Jupyter, Subversion, Git, Docker

Experience
Nov 2018 – Present
Data Engineer, Bell Mobility
As part of a corporate ETL modernization initiative, I contribute to the analysis, design, and development of a new solution that integrates Python, Spark, Hive, Parquet, JSON, Airflow, YAML, Oracle, Cloudera, and BigQuery to deliver high-throughput data ingestion and transformation from a wide variety of applications for advertisement outreach and revenue optimization (a representative orchestration sketch follows the highlights below).

 

– Work with internal business and technology staff to accurately capture requirements and specifications for the design of database models that serve reporting needs.

– Analyze business rules, business logic, and use cases, and architect a unified code base that is extensible and scalable.

– Develop Hive/MapReduce/Spark Python modules for viewership analytics.

– Develop data pipelines (ingest/clean/munge/transform) for feature extraction feeding predictive analytics and ad-revenue optimization.

– Prototype and validate serverless data integration/pipeline solutions on AWS and GCP.
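
A minimal sketch of the kind of Airflow orchestration this pipeline relies on, assuming Airflow 2.x; the DAG name, task names, and callables are hypothetical placeholders, not the production code.

    # Minimal Airflow 2.x DAG sketch: run ingestion before transformation.
    # DAG/task names and the callables' bodies are illustrative placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def ingest_feed(**context):
        # Placeholder: land the day's JSON extract on HDFS/cloud storage.
        print("ingesting feed for", context["ds"])

    def transform_feed(**context):
        # Placeholder: run the Spark job that writes Parquet partitions.
        print("transforming feed for", context["ds"])

    with DAG(
        dag_id="ad_revenue_pipeline",      # hypothetical name
        start_date=datetime(2021, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        ingest = PythonOperator(task_id="ingest_feed", python_callable=ingest_feed)
        transform = PythonOperator(task_id="transform_feed", python_callable=transform_feed)
        ingest >> transform                # ingestion precedes transformation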

Jun 2017 – Jul 2018
Solutions Consultant (Big Data), T4G
Worked on the design, architecture, and implementation of big data pipelines and HDFS ingestion from various sources, enabling efficient processing and supporting real-time queries and analysis for decision-support systems (DSS).

– Developed big data projects using Hadoop, HDFS, and data lakes.
– Designed, built, and supported cloud and open-source systems to process large amounts of data.
– Imported and exported data between HDFS and RDBMSs using Sqoop/Hive to implement transactional support and incremental loads.
– Created RDDs and DataFrames for the input data and performed transformations using PySpark to ingest, catalog, store, and analyze new datasets (see the sketch after this list).
– Used Spark APIs to perform transformations and actions on the fly, building data pipelines that consume data in batch and in real time in a lambda architecture.

– Big data analytics, ETL, data analysis, and visualization on Hortonworks and Azure platforms.

– Developed and applied experimental-design approaches to validate findings and test hypotheses, in order to investigate, develop, and propose new analytic capabilities.
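
A minimal PySpark sketch of the ingest/clean/transform pattern described above; the paths, column names, and aggregation are hypothetical placeholders.

    # Minimal PySpark sketch: ingest raw CSV, clean, transform, store as Parquet.
    # Paths and column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("ingest_transform").getOrCreate()

    # Ingest raw CSV files landed on HDFS into a DataFrame.
    raw = spark.read.option("header", "true").csv("hdfs:///landing/events/")

    # Clean and transform: drop incomplete rows, derive a date, aggregate.
    daily = (
        raw.dropna(subset=["event_id"])
           .withColumn("event_date", F.to_date("event_ts"))
           .groupBy("event_date")
           .agg(F.count("event_id").alias("events"))
    )

    # Store the curated dataset as Parquet for downstream analytics.
    daily.write.mode("overwrite").parquet("hdfs:///curated/daily_events/")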

 

Oct 2016 – Feb 2017
Hadoop Developer, CIBC@TCS
– Collaborated with the Solutions Architect to define the best strategy to migrate data sources and to load to, extract from, and interact with Hadoop (CDH distribution) for the Client Reporting solution, in accordance with the defined Enterprise Data Pattern.
– Implemented and modified scripts to load data into and extract it from Hadoop using Sqoop, Hive, and Impala for required client reporting and ad-hoc query solutions.
– Worked with Spark to explore, cleanse, and analyze data by manipulating RDDs and DataFrames, using Spark SQL to provide a SQL interface and interoperability with Hive and other data sources; cleansed, profiled, and transformed flat-file data for storage in Parquet format (see the sketch below).
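
A minimal sketch of the Spark SQL/Hive interoperability pattern used here; the database, table, column, and path names are hypothetical.

    # Minimal sketch: query a Hive table through Spark SQL, write Parquet.
    # Database, table, column, and path names are hypothetical.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("client_reporting")
             .enableHiveSupport()      # expose Hive tables via Spark SQL
             .getOrCreate())

    # Query an existing Hive table through the SQL interface.
    clients = spark.sql("""
        SELECT client_id, region, SUM(balance) AS total_balance
        FROM warehouse.client_positions
        GROUP BY client_id, region
    """)

    # Cleanse, then store as Parquet for reporting and ad-hoc queries.
    (clients.filter("total_balance IS NOT NULL")
            .write.mode("overwrite")
            .parquet("hdfs:///reporting/client_balances/"))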

May 2016 – Sep 2016
Technical Lead (Band 8), IBM Canada
– Led a team of two developers as Technical Lead.
– Wrote SQL and PL/SQL objects for a CRM application and was involved in performance-tuning activities.
– Defined processes and procedures for development, code reviews, and deployments.

 

Apr 2015 – Apr 2016
Senior Consultant, NTT Data
– Ingested data from raw files and relational databases into HDFS using Sqoop for further processing and analysis.
– Wrote Hive queries to load and process data in the Hadoop distributed file system.
– Wrote basic Spark transformation scripts (see the sketch after this list).
– Created PL/SQL stored procedures, functions, and packages for moving data from the staging area to the data mart.
– Performed SQL, PL/SQL, and application tuning using tools such as EXPLAIN PLAN, SQL*TRACE, TKPROF, and AUTOTRACE.
– As lead, handled logistics for smooth onboarding and initial training of new resources, and provided ongoing technical and logistical support to a team of five.
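
A minimal sketch of a basic Spark transformation script of this kind, using the RDD API; the record layout and paths are hypothetical.

    # Minimal RDD-based transformation sketch; record layout and paths
    # are hypothetical (e.g., CSV rows ingested to HDFS via Sqoop).
    from pyspark import SparkContext

    sc = SparkContext(appName="basic_transform")

    # Parse delimited records and total amounts per account.
    lines = sc.textFile("hdfs:///staging/transactions/")
    totals = (lines.map(lambda line: line.split(","))
                   .map(lambda cols: (cols[0], float(cols[2])))
                   .reduceByKey(lambda a, b: a + b))

    totals.saveAsTextFile("hdfs:///mart/account_totals/")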

Mar 2012 – Apr 2015
Oracle Consultant, CGI Inc.
– Managed a team of four developers; as lead, acted as facilitator and as the bridge between the client and the local team for logistical and administrative support.
– Conducted daily scrum meetings and produced weekly and monthly status and performance decks for the project.
– Tracked and generated effort-logging reports within the team and for upper management.
– Followed all client coding processes and client-designated frameworks for PL/SQL development.
– Analyzed data for data integrity and referential integrity when loaded into source and staging tables.
– Developed PL/SQL packages, procedures, triggers, functions, indexes, and collections to implement business logic; generated scripts for data manipulation and validation, and materialized views for remote instances.
– Created SQL DDL scripts and DML statements/scripts to store and manipulate data across various databases and schemas.
– Developed SQL*Loader scripts to load data into the staging tables from flat files.
– Participated in team meetings and client-required training.

May 2011 – Feb 2012
Software Designer, Royal Bank of Scotland
– Handled planning, work allocation, and tracking; prepared the implementation plan and break-glass strategy.

– Worked extensively on change requests implemented in PL/SQL, involving modification of existing scripts and packages, version control, and deployment.

– Modified and created functions in existing shell scripts for data and file manipulation, calling Oracle stored procedures and executing queries through scripting.

– Liaised with technical and business stakeholders to define the work stack and expectations.

Oct 2006 – Nov 2010
Senior Software Engineer, Mahindra Satyam
– Designed, developed, and modified tables, views, materialized views, stored procedures, packages, and functions.
– Coded PL/SQL packages and procedures to perform data loading, error handling, and logging.
– Tuned database SQL statements and procedures by monitoring run times and system statistics.
– Created new procedures, functions, triggers, materialized views, packages, simple/ref/traditional cursors, dynamic SQL, and table functions as part of project and application requirements.
– Created partitioned tables and indexes to improve application performance.
– Implemented module logic using triggers and integrity constraints.

Oct 2006 – Nov 2010
Programmer Analyst, Oracle Corporation
– Developed packages, procedures, functions, and triggers for the application.
– Tested newly coded procedures and documented them.
– Wrote PL/SQL code from technical and functional specifications.
– Developed applications using SQL, PL/SQL, HTML, Oracle Application Express, and Oracle Discoverer Reports.
Education
Aug 2004
Master of Science (IT), Punjab Technical University
Jul 2000
Bachelor of Commerce, Delhi University