Duties include: Develop Python repositories for efficient ingestion, standardization, and loading of clickstream data into Snowflake. Leverage Adobe’s REST API to retrieve hourly clickstream and dimension data, allowing detailed analysis of user interactions. Standardize and transform the collected data into a payload format, facilitating efficient storage and retrieval in Snowflake’s VARIANT type columns. Implement encryption mechanisms to ensure data security during storage and transfer in a standard AWS S3 bucket. Configure Snowpipe for automated ingestion of clickstream data from S3 into a designated landing table in Snowflake. Develop Python repositories to handle the loading process, including decryption, dimension data integration using Type 2 slowly changing dimensions (SCD), masking of PII data, and loading of clickstream and person payloads into their respective target tables. Overcome challenges associated with missing person identifiers caused by VPN usage, incognito browsers, and the use of different devices by employing the recordlinkage library to establish a person identity graph and assign person IDs to records. Generate actionable insights and patterns from the warehouse tables, enabling the healthcare client to better understand user behavior and make data-driven decisions to optimize its website for appointment bookings. Work on customer activity and personal data, masking PII data and enabling roles for accessing PII data. This position requires a Master’s degree in Computer Science, Electrical/Electronics Engineering, or a related field. Work location: Spring Hill, TN. Apply: hr@tror.ai
Experience Required
0-1 Years
Qualification
Master’s degree in Computer Science, Software or Electrical/Electronics Engineering, or a related field
