data engineer interview questions

In this video I talk about salting in Spark.
Directly connect with me on:- topmate.io/manish_kumar25
Discord channel:- / discord
Project details for resume :-
Successfully led a data engineering project in a retail environment using technologies such as Apache Spark, Python, SQL, and Amazon S3 to optimize data processing.
Implemented structured data models, including dimension and fact tables, to provide valuable context for point-of-sale data analysis.
Designed and executed an incentive program based on sales performance, enhancing motivation among sales teams by rewarding top performers.
Managed extensive daily data volumes of approximately 100GB, demonstrating the ability to handle large-scale data pipelines.
Employed Spark optimization techniques like caching and broadcast joins to improve data processing speed and efficiency.
Utilized Azure CI/CD pipelines for code deployment, and orchestrated workflows using Airflow and CRON jobs.
Detailed writeup to explain more during interview:-
As a Data Engineer on a project for a prominent offline grocery and kitchen supplies retailer, I applied my expertise in data engineering to drive critical improvements in their data processing and analysis operations.
The project primarily focused on processing and analyzing point-of-sale data, which was structured into dimension and fact tables to provide meaningful context for sales analysis. To further enhance employee motivation and performance, we designed and implemented an incentive program that rewarded salespeople with the highest sales volumes in each store.
Handling a substantial daily data volume of approximately 100GB, we leveraged Apache Spark and applied optimization techniques like data caching and broadcast joins to significantly accelerate data processing. This not only improved the speed of our data pipelines but also increased the efficiency of our data analysis.
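To make the broadcast join mentioned above easier to explain in an interview, here is a minimal plain-Python sketch of the idea, not Spark code. A full copy of the small dimension table is available to every partition of the large fact data, so each partition can be joined locally without shuffling the fact rows. The table contents and the names `STORE_DIM`, `FACT_PARTITIONS`, `broadcast_join`, and `sales_by_store` are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical small dimension table: store_id -> store name.
STORE_DIM = {1: "Downtown", 2: "Airport"}

# Hypothetical fact rows split across two partitions: (store_id, sale_amount).
FACT_PARTITIONS = [
    [(1, 120.0), (2, 35.5)],
    [(1, 60.0), (3, 10.0)],
]


def broadcast_join(partitions, small_table):
    """Join every partition locally against a full copy of the small table."""
    joined = []
    for partition in partitions:  # each partition is handled independently
        for store_id, amount in partition:
            name = small_table.get(store_id)
            if name is not None:  # inner-join semantics: unmatched keys are dropped
                joined.append((store_id, name, amount))
    return joined


def sales_by_store(joined_rows):
    """Aggregate the joined rows into total sales per store name."""
    totals = defaultdict(float)
    for _, name, amount in joined_rows:
        totals[name] += amount
    return dict(totals)
```

In Spark itself the same effect is usually requested with a broadcast hint on the smaller DataFrame; the sketch only shows why no shuffle of the large side is needed.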
We seamlessly integrated the code deployment process into the Azure CI/CD pipeline. As part of workflow automation, we orchestrated task scheduling using Airflow and CRON jobs.
One of the project's major achievements was the implementation of a customer engagement strategy that identified infrequent buyers and provided incentives in the form of coupons. This initiative not only boosted customer retention but also had a positive impact on the overall business growth.
For more queries, reach out to me on the social media handles below.
Follow me on LinkedIn:- / manish-kumar-373b86176
Follow Me On Instagram:- / competitive_gyan1
Follow me on Facebook:- / manish12340
My Second Channel -- / @competitivegyan1
Interview series Playlist:- • Interview Questions an...

Comments: 114

  • @payalbhatia6927
    5 days ago

    @Manish Kumar, all of your videos are gems. I have 4-5 YOE and never got to learn Spark in such depth, with such clear, concise answers and questions. I can vouch that it is useful even at 10 YOE. I have ADHD, but your videos are so engaging that I can sit through them for a long time. I have become interested in learning. You must be an extraordinary guy. Having knowledge is one thing; presenting it in such a simple manner is what sets you apart. It is very difficult to be simple. Thanks once again.

  • @aprao8014
    9 months ago

    Bhai, this video made me your fan. "People don't have experience, and companies want experience" 🔥 🔥

  • @hritikapal683
    9 months ago

    What gem content, sir 🥺 Thank you so much for the in-depth video!

  • @shadabahmed8817
    9 months ago

    Thanks a lot, Manish bhaiya. You listen even to individual requests. Big thanks to you. Loving your content.

  • @rameshjadhav4963
    5 months ago

    Hey Manish, I'm extremely thankful to you and all of your playlists. Especially this video is super problem solver one! No one teaches in so much depth as you do. Thanks for taking out time to teach us!!😇🤗

  • @rishav144
    9 months ago

    Thanks for the amazing content. The Spark playlist is amazing.

  • @WolfmaninKannada
    9 months ago

    Sir, you're amazing. No one has created content on this till now. I wish to see more of this type of content. As freshers, we need a clear idea of how a project works, and we should know how to explain the project to an interviewer.

  • @user-yd1fu7hl6j
    9 months ago

    Wow... Manish bhai really loved this content. Please, I will encourage you to do more videos like this.

  • @ManaviVideos
    9 months ago

    Thanks for the session!!

  • @MCAMadeEasy
    5 months ago

    Bhai, salute to you for speaking so directly, clearly, and truthfully. I will definitely connect with you on Topmate after getting my data engineering job, to thank you! Hopefully I won't need to connect before that.

  • @swapnalikudale2458
    7 months ago

    This whole PlayList helped a lot.💡

  • @varunmehta591
    9 months ago

    bhai bhai 🙌...ultimate video ❤❤

  • @abhayc8015
    9 months ago

    Thank You manish bhaiya

  • @ajaypatil1881
    8 months ago

    thank you so much bhaiya for amazing content 💝

  • @user-gm7fn8ri8i
    9 months ago

    Great, Manish. You say many times in your videos not to fake experience. You're doing a great job.

  • @vivekpuurkayastha1580
    9 months ago

    Great video, Manish. What problems have you faced while doing your projects, and how did you resolve them? Please answer this question as an experienced person.

  • @nipun384
    9 months ago

    THANKSS VROOO LOVE U YR CLEARED INTERVIEW

  • @mnshrm378
    4 months ago

    Hey Manish! I am following all your playlists and content. I have given more than 30 interviews but have not been selected yet because of schedulers. If you can cover one of them, or a pipeline that includes Airflow or any other scheduler, it will be very helpful. Without scheduler knowledge the preparation is incomplete, because they ask about it in each and every interview. You explain very well, so I would like an in-depth explanation from you. Thanks.

  • @ig2947
    4 months ago

    Amazing...!!!

  • @vikastiwari9415
    9 months ago

    amazing content..

  • @ajaypatil1881
    8 months ago

    most exciting video

  • @__oo__._._._._._._._.___00007
    11 days ago

    Thank you

  • @tanushreenagar3116
    7 months ago

    Thanks too much

  • @abhijeetsugam
    8 months ago

    Very good video.

  • @shubhamdeshmukh6339
    2 months ago

    Thanks I just completed the playlist

  • @sanooosai
    3 months ago

    thank you sir

  • @shadabahmed8817
    9 months ago

    Eagerly waiting for the 2nd part, related to the last project. Please upload that one next time.

  • @manish_kumar_1

    9 months ago

    I did not get you. Everything I covered was related to the project I walked you through. Maybe you have not watched the full video, or I did not understand the question.

  • @shubhamalsunde3230
    9 months ago

    nice content Sir

  • @SantoshKumar-yr2md
    4 months ago

    universal truth of industry you explained

  • @dakait0867
    9 months ago

    Bhai, please make a detailed practical video on CI/CD using Azure DevOps/Databricks. That would be a great help.

  • @vaibhavpoul1067
    9 months ago

    What a content sir ❤most needed

  • @bobbytheman7535
    9 months ago

    Why do we need layers in a data warehouse? Can we put a for-each loop inside another for-each loop?

  • @ChandikaRohini
    5 months ago

    Please make a video on coding questions and scenario questions (e.g. what if the repartition size increases, how to handle out-of-memory issues, and other questions that are commonly encountered).

  • @widelens_world
    9 months ago

    How do we analyse the source data in our project to find where we need to perform cleaning operations?

  • @user-yd1fu7hl6j
    9 months ago

    Bhai, please share 2-3 challenges related to the cluster and to incoming data.

  • @loveyourselffirst6565
    4 months ago

    Bhaiya, please make a video on AWS MapReduce...

  • @rh334
    9 months ago

    How to do ONPREMISE to CLOUD migration.

  • @user-yd1fu7hl6j
    9 months ago

    Bhai, how many columns and rows do the tables have, and of what types? For example cust_id and refund: please tell us how many columns there can be and which types they are.

  • @anweshkumarsahoo3927
    7 months ago

    Shall I add personal project section along with work experience section in Resume for 2 YOE in DE ??

  • @bhavyamalviya8364
    7 months ago

    😂😂i thoroughly learnt and enjoyed this video

  • @parameshwarbhange9857
    9 months ago

    What problems have you faced while doing your projects, and how did you resolve them? Please answer this question as an experienced person.

  • @prashantmehta2832
    2 months ago

    Thank you so much sir for the great explanation, It was the best series I have found in my life. I just have one request from you. Can you please make a video on Cluster manager - Yarn.

  • @manish_kumar_1

    2 months ago

    Noted

  • @prashantmehta2832

    2 months ago

    @@manish_kumar_1 Thanks sir..

  • @roshan_off1955
    9 months ago

    Bro, if we have not used Airflow for scheduling jobs and they ask questions about it, what do we do?

  • @abhishekchaturvedi9855
    7 months ago

    Manish going through all of your videos I realized almost all of the optimization is based on number of rows. Do we have any optimization where data increases in terms of columns?

  • @shaikhrizwan9907
    9 months ago

    Manish, awesome videos. Can you make some videos on AWS Glue jobs?

  • @manish_kumar_1

    9 months ago

    I don't know Glue.

  • @SATISHKUMAR-qk2wq
    9 months ago

    I was asked this kind of questions in interview

  • @user-rh1hr5cc1r
    2 months ago

    Bhaiya, I could not find this spark-submit config anywhere in the cloud when creating a cluster in Databricks. Is spark-submit only for those who work on-prem?

  • @raghavendrakulkarni3920
    8 months ago

    Platform metric used ?

  • @eagleeyetradingacademy
    5 months ago

    Can we use this project for 3-4 years of experience?

  • @Wandering_words_of_INFJ
    8 months ago

    Hello Sir, first of all, thank you for this amazing content. Truly grateful. I request you to please make a video of real Azure data engineer project questions, combined with Databricks, to prepare for interviews. Please.

  • @manish_kumar_1

    8 months ago

    I am not familiar with Azure services.

  • @Wandering_words_of_INFJ

    8 months ago

    @@manish_kumar_1 Okay sir. By the way, I don't see any positions for just "PySpark Developer". Will you make a video in future on skill sets: which skills to mention on a resume, and what the relevant positions in the industry are?

  • @dhavalkacha2481
    9 months ago

    As a fresher, if I mention a project in my resume, can I say I completed it in 1 or 2 months?

  • @desmond7182
    5 months ago

    Please make a video for freshers (0 years of experience).

  • @rh334
    9 months ago

    Can you make content about Kafka?

  • @sonjoysaha5454
    6 months ago

    great work. informative video. love it. I have a question about the data you receive. Do you receive 100 GB of new data every day?

  • @manish_kumar_1

    6 months ago

    Not in every project, but in my last project I had that opportunity.

  • @sameersuryawanshi145
    9 months ago

    I am getting an error while using Spark, something like 'remote rpc client issue', due to an executor-lost failure / heartbeat issue. Please help.

  • @poojajoshi871
    9 months ago

    Hi Manish, I got a call where they are asking for experience in AWS Glue and PySpark. Please tell me how to use Glue with PySpark.

  • @poojajoshi871

    9 months ago

    I know Spark, and Glue is an ETL tool, so how do I use Spark with Glue?

  • @kunalk3830
    8 months ago

    Q.) Data skew is one example for which you do Spark optimization. Apart from data skew, what else have you optimized for? Q.) What kinds of issues have you faced while working on your project? I mean, I wanted a proper, systematic approach to this question. I have an idea of the topics, but when I think about it the points seem scattered.

  • @anketsonawane6651
    9 months ago

    Hey Manish, can you make a video on an end-to-end data engineering project? It would be very helpful for understanding a data engineering pipeline.

  • @manish_kumar_1

    9 months ago

    You neither watched the full video nor checked the link added in the i button. I have already walked through a project, and the link was provided too.

  • @anketsonawane6651

    9 months ago

    @@manish_kumar_1 Sure, Manish... I regret the wrong comment. I will surely check it out, and thanks for this amazing content ❤️

  • @koeld830
    23 hours ago

    What is delta cache?

  • @rahulrai4686
    7 months ago

    Do we need to know DSA, or should we focus on something else?

  • @DEwithDhairy
    6 months ago

    DSA interview series for Data Engineer kzread.info/head/PLqGLh1jt697wQTamFvXx_Odlm-Wg3zbxq&si=suGxMRqt-uoYkprY

  • @Soccerfan_17
    9 months ago

    How do you analyse your source data before you start cleaning?

  • @rawat7203

    9 months ago

    1. First, remove the non-CSV files.
    2. Read the correct files into a DataFrame using Spark.
    3. Check whether these files have the mandatory columns; if not, remove those files.
    4. If some files have extra columns, add a column called "extra column" and put all of those columns there.
    5. Now we have a DataFrame with all the correct data. Join it with the dimension table DataFrame to create a final DF.
    6. On this final DF, do the Spark processing to get the desired calculation.
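The cleaning steps described in the reply above could be sketched roughly as follows. This is a plain-Python illustration using only the standard library, not the Spark job itself, and it works on in-memory file contents. The mandatory column names, the `extra_columns` JSON packing, and the function names `clean_files` and `join_dimension` are assumptions chosen for the example.

```python
import csv
import io
import json

# Hypothetical mandatory columns; a real project would define its own.
MANDATORY = ["cust_id", "store_id", "amount"]


def clean_files(files, mandatory=MANDATORY):
    """files maps file name -> file text; returns rows from valid CSV files only."""
    rows = []
    for name, text in files.items():
        if not name.lower().endswith(".csv"):
            continue  # drop non-CSV files
        reader = csv.DictReader(io.StringIO(text))
        header = reader.fieldnames or []
        if not set(mandatory).issubset(header):
            continue  # drop files that lack a mandatory column
        extras = [c for c in header if c not in mandatory]
        for record in reader:
            row = {c: record[c] for c in mandatory}
            # pack any extra columns into one JSON-encoded column
            row["extra_columns"] = json.dumps({c: record[c] for c in extras})
            rows.append(row)
    return rows


def join_dimension(rows, store_dim):
    """Inner-join cleaned rows to a dimension lookup keyed by store_id."""
    return [
        dict(r, store_name=store_dim[r["store_id"]])
        for r in rows
        if r["store_id"] in store_dim
    ]
```

Packing the unexpected columns into a single field keeps the schema stable while still preserving the extra data for later inspection.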

  • @rahulrai4686
    7 months ago

    Sir, how can we talk to you personally?

  • @princyanghan8734
    9 months ago

    Why did you stop uploading videos sir, please keep sharing.

  • @manish_kumar_1

    8 months ago

    Started again

  • @sanketraut8462
    9 months ago

    Can we say our source and sink are the same, for example Hadoop HDFS?

  • @manish_kumar_1

    9 months ago

    Yes

  • @greendaywithtrading7408
    9 months ago

    Why did you stop uploading videos ??? eagerly waiting for new video

  • @manish_kumar_1

    9 months ago

    I was out of station due to job requirements

  • @user-iz5hj1ep8s
    9 months ago

    Sir, please also cover Python and Spark coding questions now...

  • @manish_kumar_1

    9 months ago

    I have already covered company-specific ones in one of the playlists.

  • @anweshkumarsahoo376
    8 months ago

    Manish bhaiya, the keywords like BWAC and MHCDM that you used in your resume, are those your roles?

  • @manish_kumar_1

    8 months ago

    No, they are project names.

  • @raajnghani
    9 months ago

    How to unbroadcast the dataframe?

  • @manish_kumar_1

    9 months ago

    Set the configuration of broadcast threshold to -1
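As a hedged illustration of the reply above: in Spark SQL the relevant setting is `spark.sql.autoBroadcastJoinThreshold`, and a value of -1 turns automatic broadcast joins off. The helper below is a minimal sketch that assumes `spark` is an existing SparkSession; the function name `disable_auto_broadcast` is made up for the example. Note that this only stops Spark from choosing a broadcast join on its own; as far as I know, an explicit broadcast hint in the query is still honored.

```python
AUTO_BROADCAST_KEY = "spark.sql.autoBroadcastJoinThreshold"


def disable_auto_broadcast(spark):
    """Turn off automatic broadcast joins for the given session.

    `spark` is assumed to be a SparkSession (anything exposing
    conf.set / conf.get works for this sketch).
    """
    spark.conf.set(AUTO_BROADCAST_KEY, "-1")
    return spark.conf.get(AUTO_BROADCAST_KEY)
```

The same setting can also be passed at submit time as a `--conf` key/value pair instead of being set in code.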

  • @cretive549
    9 months ago

    Sir, how much maths is required for a data engineer profile? Please reply.

  • @manish_kumar_1

    9 months ago

    It is not required.

  • @rajaprasad-vv2rf
    2 months ago

    How many nodes do we use in our project?

  • @manish_kumar_1

    2 months ago

    Nodes belong to the cluster. When a job is scheduled we don't specify the number of nodes; we specify the number of executors, and more than one executor can start on the same node.
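To illustrate the point in the reply above, that a job is sized by executors and not by nodes, here is a small sketch. `--num-executors`, `--executor-cores`, and `--executor-memory` are real `spark-submit` options (`--num-executors` applies on resource managers such as YARN), but every number below, the script name, and the helper names are hypothetical. The per-node estimate is a rough upper bound that ignores memory overhead and cores reserved for the OS.

```python
def executors_per_node(node_cores, node_memory_gb, executor_cores, executor_memory_gb):
    """Rough upper bound on how many executors a single node can host."""
    return min(node_cores // executor_cores, node_memory_gb // executor_memory_gb)


def spark_submit_command(app_path, num_executors, executor_cores, executor_memory_gb):
    """Build a spark-submit argument list that sizes the job by executors."""
    return [
        "spark-submit",
        "--num-executors", str(num_executors),
        "--executor-cores", str(executor_cores),
        "--executor-memory", f"{executor_memory_gb}g",
        app_path,
    ]
```

For example, with assumed 16-core, 64 GB nodes and 4-core, 8 GB executors, `executors_per_node(16, 64, 4, 8)` gives 4, so a request for 10 executors could land on as few as three nodes.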

  • @avinash7003
    9 months ago

    Are there still calls for big data AWS roles?

  • @prabhatgupta6415

    9 months ago

    Are you not getting any?

  • @avinash7003

    9 months ago

    @@prabhatgupta6415 What is the present market like for AWS?

  • @prabhatgupta6415

    9 months ago

    @@avinash7003 I'm an Azure guy, sir.

  • @rahulrai4686
    7 months ago

    Sir, how much Python do we need to know?

  • @ishwarkoki1119
    8 months ago

    Manish bhai, the spelling of "related" is wrong in the thumbnail!

  • @manish_kumar_1

    8 months ago

    Oh, thanks for pointing it out

  • @sathyak3285
    8 months ago

    Please talk in English so that everyone can understand. And please give answers to the questions.

  • @ruchim3448
    8 months ago

    Is this the complete playlist?

  • @manish_kumar_1

    8 months ago

    Yes

  • @ruchim3448

    8 months ago

    @@manish_kumar_1 thank you.

  • @jhonsen9842
    2 months ago

    One Like and One Comment.

  • @amangurjar9714
    9 months ago

    Can a fresher become a data engineer?

  • @shubhamchavan9438

    9 months ago

    If you keep asking this question all year, you won't become one; but if you ask once and practice all year, you will.

  • @amangurjar9714

    9 months ago

    Should I buy a course for data engineering, or should I prepare from KZread only and build online projects?

  • @rakeshverma6867

    9 months ago

    @@shubhamchavan9438

  • @BigDataWithSky
    1 month ago

    Why don't you talk in English 😢 for non-Hindi speakers 😊

  • @user-tb8ry2jl7s
    9 months ago

    Sir, I got placed as an Azure data engineer. It's all because of you. Really, thank you for everything 🥹🥹 I would like to talk with you.

  • @manish_kumar_1

    9 months ago

    Congratulations, bhai. Please ping me on LinkedIn or Insta. You will find the social media links in the description.

  • @likhithurs8597

    7 months ago

    Hi, hearty congratulations on your success 🙌

  • @simizcodding4487

    7 months ago

    Hey, I'd like to contact you... please drop your LinkedIn ID.

  • @surajpoojari5182

    4 months ago

    Congratulations bro

  • @chinnasaiprathapmeesala8977

    3 months ago

    Bro can you share your interview preparation questions for Azure Data engineer