● Design, build, and maintain ELT data pipelines/DAGs using Airflow, Snowflake, DBT, and AWS services such as AWS Glue and Amazon S3.
● Collaborate with other members of the data team to ensure that our data is accurate, timely, and available to stakeholders.
● Write and maintain efficient SQL and Jinja code to transform and manipulate data using DBT.
● Develop and maintain data models that support business requirements.
● Implement data quality checks and DBT tests to ensure the accuracy and completeness of our data.
● Work with stakeholders to understand and document their data requirements, and provide solutions that meet their needs.
● Troubleshoot and resolve data issues as they arise.
● Designed and implemented an ML project using Python, TensorFlow, Pandas, and SQL, with Azure Machine Learning for robust model development and deployment, improving predictive accuracy of score analysis for legal entities of a Brazilian bank (Phoenix team, Brain project).
● Integrated APIs and managed batch processes, working with standards and formats such as XML, JSON, and SOAP.
● Developed ETL routines using PySpark, SQL, and Hadoop to streamline data processing and integration for the bank’s data engineering team, reducing data processing time by 25%.
● Automated processes with Python and Bash scripting, increasing productivity by 30% for the operations team.
● Structured relational and non-relational databases using PostgreSQL and Apache HBase, and developed new features for and maintained a Python/Spark application, improving application performance for the product development team of a financial institution.
● Cleaned, transformed, and prepared complex data for analysis and created Power BI visualizations for data exploration and storytelling, improving data comprehension and decision-making for an educational institution’s marketing analytics project.
● Wrote SQL queries for a PostgreSQL database and used Python’s psycopg2 library together with PySpark to build informative data tables, improving data accessibility and analysis for the data science team.
● Migrated on-premises pipelines to Azure, improving scalability and reliability for the data engineering project.
● Processed, cleaned, and prepared data for analysis and created Power BI visualizations for data exploration, improving data comprehension and decision-making for an industrial company’s marketing analytics project.
● Structured relational and non-relational databases using Microsoft SQL Server and Apache HBase, and developed new features for and maintained a Python/Spark application, improving application performance for the product development team.
● Supported data preparation in Azure and GCP environments using Databricks, improving data quality and accessibility for the data engineering team.
● Monitored supplier action plans and controlled project execution deadlines, ensuring timely project completion and supplier accountability for the supply chain management team.
● Managed project reports and quality indicators (KPIs), providing performance insights for project stakeholders and supporting data-driven decision-making.