
Chapa Sireesha

Available

Last update: 06.09.2022

Analyst, Senior Analyst, Analyst

Degree: not specified
Hourly/daily rate: show
Language skills: German (native speaker)

File attachments

Sireesha_Resume_Hadoop.doc

Skills

Enterprise-based applications, Hive, Impala, Sqoop, HDFS, Oozie, Cloudera Hue, PuTTY, Apache, Apache Kafka, AWS cloud services, cloud ecosystem, automation/scheduling tools (CA7 scheduling, Control-M, ESP), big-data database Hive, file formats (Parquet, ORC, JSON, Avro, XML), GIT, commit/track, ZooKeeper, YARN, HBase, Python, SQL, shell scripting, UML, Hadoop ecosystem (Hadoop, HBase, Apache Spark, NiFi, Kibana, Apache Hive, Apache HBase), Hadoop distributions (Cloudera, Hortonworks), CA7, Unix/Linux OS, cloud computing, Amazon Web Services (AWS), Hue, Version One, GitHub, analytics, Teradata, mainframe, MapReduce, unit testing, test scripts, test-driven development, continuous integration, backups, DOCO files, continuous deployment, code review, Spark, Scala, LTE, MSBI, Hadoop, scalability, Spark Streaming, real-time data, stateful processing, exception handling, Agile development methodology (Scrum), Sqoop Eval, database, data quality, DB, SQL queries, Control-M 8.0, CA 11, FTP, MVS

Project history

08/2018 - 06/2020
Senior Software Engineer
CGI

Client: PNC Bank, USA
Environment: HDFS, Hive, Sqoop, Python, Oozie, Impala, CA7 Workload Automation, Hue, PuTTY, Version One, GitHub, Cloudera.
Team: LDH (Lending Data Hub)

Lending Data Hub (LDH) is the repository for all lending data over the lifecycle of a loan, from origination through servicing up to charge-off/recovery, for all 9 asset classes.
LDH consistently provides quality data for various lending modelling and analytical purposes, enabling model development and portfolio reporting/analytics in support of Basel (Basel Committee on Banking Supervision), CCAR (Comprehensive Capital Analysis and Review), CECL (Current Expected Credit Losses) and other strategic models and analytics.
LDH deals with data for the different asset classes of Resi and Non-Resi types, such as auto loans, student loans, credit cards, ULOC, other consumer loans, leasing, business banking, home equity and mortgage loans of PNC Bank. Data is pulled from Teradata and mainframe tables using Sqoop and dumped into the Hive region and then HDFS. All history and spot data is inserted into the required target tables in the required formats, and the data is then queried to analyse the output.
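
A minimal sketch of the partitioned history/spot load pattern described above. The project itself ran its HiveQL through Oozie/CA7-scheduled jobs rather than Spark, and the database, table and column names here are hypothetical; this Python (PySpark) snippet only illustrates the idea.

    # Illustrative only: database, table and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("ldh_spot_load")                                # hypothetical job name
        .enableHiveSupport()                                     # use the Hive metastore
        .config("hive.exec.dynamic.partition", "true")
        .config("hive.exec.dynamic.partition.mode", "nonstrict")
        .getOrCreate()
    )

    # Move the Sqoop-landed staging rows into a partitioned target table,
    # mirroring the history/spot insert described above.
    spark.sql("""
        INSERT OVERWRITE TABLE ldh.loan_spot PARTITION (as_of_date)
        SELECT loan_id, asset_class, balance, as_of_date
        FROM ldh_stage.loan_spot_raw
    """)

    spark.stop()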

Responsibilities:
* Involved in the end-to-end process of implementing and scheduling jobs through Hive, HDFS, Sqoop, Impala, CA7 and Oozie.
* Involved in creating Hive tables, loading data into partitioned tables, performing aggregations at different levels, and writing Hive queries that run internally as MapReduce.
* Responsible for data migration, loading and mapping; system setup and configuration; and functional processes within the Crimson Solution packages, ensuring that the requirements presented for customer data and systems are accurate and follow standard operational processes.
* Worked closely with the business and analytics teams in gathering system requirements. Developed all mappings according to the design document.
* Exported and imported data into HDFS, HBase and Hive using Sqoop.
* Supported code/design analysis, strategy development and project planning.
* Evaluated Oozie for workflow orchestration in the automation of Hive jobs using Python scripts (see the sketch after this list).
* Performed unit testing and prepared test cases for each data load before passing it to QA.
* Worked on performance tuning to improve job performance.
* Implemented test scripts to support test-driven development and continuous integration.
* Involved in data copies and data backups to maintain data.
* Involved in ABC controls for automatic pre- and post-load data validations.
* Created DOCO files on the mainframe and pushed those jobs into scheduling.
* Scheduled jobs through CA7 in the QA and production environments for automation; monitored and scheduled ad-hoc jobs in CA7.
* Worked extensively in the GIT repository for continuous integration and continuous deployment.
* Responsible for data loads into production tables.
* Gained very good business knowledge of the banking domain.
* Extensive code review of peers' work, and the ability to give sessions on business, functional and technical functionality as part of knowledge transfer sessions.
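
A minimal sketch of the kind of Python wrapper referenced in the Oozie automation bullet above: it submits a HiveQL file through beeline and raises on a non-zero exit code so the calling scheduler action is marked as failed. The JDBC URL and file paths are hypothetical.

    # Minimal sketch: JDBC URL and paths are hypothetical.
    import subprocess
    import sys

    HIVE_JDBC_URL = "jdbc:hive2://hiveserver2.example.com:10000/default"  # assumed endpoint

    def run_hive_script(script_path: str) -> None:
        """Run a HiveQL file through beeline and fail loudly on a non-zero exit code."""
        result = subprocess.run(
            ["beeline", "-u", HIVE_JDBC_URL, "-f", script_path],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            print(result.stderr, file=sys.stderr)
            raise RuntimeError(f"Hive script failed: {script_path}")

    if __name__ == "__main__":
        run_hive_script("/apps/ldh/hql/load_loan_spot.hql")  # hypothetical script path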

01/2016 - 08/2018
Analyst
Verizon Data Services Pvt LTD

Client: Verizon, USA
Environment: HDFS, Spark, NiFi, Scala, HBase, Kafka, YARN, Kibana, ZooKeeper, Hortonworks.
Team: IDW (Integrated Data Warehousing)

Verizon Communications, Inc. is a 126-billion-USD American multinational telecommunications conglomerate and the largest U.S. wireless communications service provider. It operates a national 4G LTE network covering about 98 percent of the U.S. population.
IDW is a system that collects telephone call data passing through the different telecom switches spread across different geographic locations at different scheduling periods. The system previously ran on MSBI and was migrated to Hadoop for the cost effectiveness and scalability of the Hadoop framework. It collects the call information generated in the telephone exchanges every hour and applies hourly, daily, weekly and monthly aggregations. The aggregation results are passed to downstream systems to be utilised by business users. Active call log data is analysed.
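
The project's streaming code was written in Scala on the Spark Streaming API; the following Python (PySpark Structured Streaming) snippet is only an illustrative sketch of the same Kafka-to-hourly-aggregation flow, with hypothetical broker, topic, schema and path names. It assumes the spark-sql-kafka connector is on the classpath.

    # Illustrative sketch only: brokers, topic, schema and paths are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json, window
    from pyspark.sql.types import StringType, StructField, StructType, TimestampType

    spark = SparkSession.builder.appName("idw_call_stream").getOrCreate()

    # Assumed shape of one call record arriving as JSON on the Kafka topic.
    call_schema = StructType([
        StructField("switch_id", StringType()),
        StructField("call_id", StringType()),
        StructField("call_time", TimestampType()),
    ])

    raw = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical brokers
        .option("subscribe", "idw_call_logs")                # hypothetical topic
        .load()
    )
    calls = (
        raw.select(from_json(col("value").cast("string"), call_schema).alias("c"))
        .select("c.*")
    )

    # Hourly call counts per switch, analogous to the hourly aggregations above.
    hourly = (
        calls.withWatermark("call_time", "2 hours")
        .groupBy(window(col("call_time"), "1 hour"), col("switch_id"))
        .count()
    )

    (
        hourly.writeStream
        .outputMode("append")
        .format("parquet")
        .option("path", "/data/idw/hourly_counts")            # hypothetical HDFS path
        .option("checkpointLocation", "/checkpoints/idw_hourly")
        .start()
        .awaitTermination()
    )
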
Responsibilities:
* As part of the project, interacted with senior client team members and analysed requirements.
* Captured data from clusters using the NiFi tool.
* Involved in moving files located in the cluster to Spark Streaming through Kafka using NiFi.
* Implemented Kafka in development for message queuing and data connectivity, with producers and consumers set up through the creation of topics.
* Used JSON as the source for Spark Streaming.
* Developed Scala code to process these files using Apache Spark and perform the necessary aggregations so that the results can be consumed by downstream users.
* Experience in converting Sequence/CSV files to JSON and storing them in HDFS.
* Developed a streaming application to pilot the stream process from Kafka to Hadoop.
* Integrated Kafka with Spark Streaming to do real-time data ingestion and consume the data accordingly.
* Applied filter operations on the stream data, collected the required data and stored it in HBase.
* Handled large datasets using partitions and Spark in-memory capabilities.
* Worked on Spark stateless and stateful transformations.
* Used GIT as the remote repository and maintained GIT branches during project development.
* Implemented Scala exception handling capabilities for better application flow.
* Performed performance tuning in Apache Spark.
* Used the Agile (Scrum) development methodology with effective retrospectives, poker planning, burn-down charts, and estimation, capacity and velocity tracking.
* Gained very good business knowledge of the telecom domain.

03/2015 - 12/2015
Senior Analyst
HCL Technologies

Client: Johnson & Johnson
Domain: Health Care
Environment: Sqoop, HDFS, Hive, HBase, Sqoop Eval, Oozie
Team: Data Fabric

Data Fabric deals with all pharmaceutical- and medical-related data, which is ingested into the data lake. Data Fabric is an analytics system in which data is pulled from many external sources such as Teradata and MySQL and ingested into HDFS.
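
A minimal sketch of the kind of incremental Sqoop offload into HDFS described above and in the responsibilities below; the connection string, credentials handling and table names are hypothetical.

    # Minimal sketch: connection string, credentials and table names are hypothetical.
    import subprocess

    def sqoop_incremental_import(table: str, check_column: str, last_value: str) -> None:
        """Run a Sqoop incremental-append import of one source table into HDFS."""
        cmd = [
            "sqoop", "import",
            "--connect", "jdbc:mysql://dbhost.example.com/datafabric",  # assumed source DB
            "--username", "etl_user",
            "--password-file", "/user/etl/.db_password",
            "--table", table,
            "--target-dir", f"/data/raw/{table}",
            "--incremental", "append",
            "--check-column", check_column,
            "--last-value", last_value,
            "-m", "4",                   # four parallel mappers
        ]
        subprocess.run(cmd, check=True)  # raise if Sqoop exits non-zero

    if __name__ == "__main__":
        sqoop_incremental_import("patient_events", "event_id", "0")  # hypothetical table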

Responsibilities:

* Experience installing, configuring and testing Hadoop ecosystem components, and collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
* Created reports for the BI team using Sqoop to export data into HDFS, Hive and HBase.
* Offloaded binary data from Teradata; Sqoop pulls the data from the DB and stores it in HDFS.
* Executed free-form SQL queries to import rows in a sequential manner.
* Synchronised the data in HDFS by using the incremental parameter on data imports.
* Used the Sqoop Eval tool to run sample SQL queries against Teradata and preview the results on the console.
* Wrote CLI commands for HDFS.

06/2013 - 03/2015
Analyst
HCL Technologies

Tools used: Control-M 8.0, CA7, CA 11, ESP, z/OS, ServiceNow
Client: Dixons Stores Group International
Role: Batch Management
Description: Dixons Stores Group is one of the leading retail chains in the UK. The project aims at designing and developing software that a company can use to maintain employee records. Employee details such as salary, mail ID, designation, date of joining and extension are stored in a database that can be accessed at any time by authorised people. Batch processing is used at the end of the month to calculate the payroll of all employees of the organisation, according to their respective departments and grades.
Roles and Responsibilities:
* Monitored the production batch job cycle flow through CA-7, provided report statistics to the client, and provided immediate fixes to production abends to ensure successful job completion.
* Ad-hoc scheduling of jobs and changing job schedules.
* Resolved abends such as space, FTP and DB contention, and notified data errors to the respective teams.
* Monitored and issued various commands on MVS consoles.
* Restarted, reran, cancelled or force-completed abended jobs as per requirements.
* Performed miscellaneous setup actions such as renaming/deleting/copying datasets.
* Ensured all online regions were up on time; analysed long-running and late jobs and lookbacks.

Willingness to travel

Available worldwide