Sumit Mittal

Hey!

I'm Sumit Mittal - Founder & CEO of TrendyTech.

I transform the careers of Big Data aspirants through my carefully curated master's program, helping them evolve into Big Data experts. I have put wholehearted effort into presenting the best online Big Data course, drawing on experience gained from working on multiple challenging Big Data projects as an ex-Cisco and ex-VMware employee.

The journey began in 2018 with my passion for teaching. I started by training a few working professionals, and eventually quit my high-paying job to pursue my passion and bring about a change in the professional lives of many.

I have incorporated effective learning approaches to mastering Big Data, assimilated over the years as an alumnus of top educational institutions like NIT Trichy, BITS Pilani, and IIIT Bangalore.

Link to my website: trendytech.in

Comments

  • @abhishekn786 · 1 day ago

    Dear Sir, hope you are doing well. When can we expect the next video on Python? It's been more than 3 months since you posted a follow-up video on Python. I guess everyone is eagerly waiting. Please post it asap. Thanks

  • @ANIRUDH6315 · 1 day ago

    Sir, please upload more videos. We are waiting for the next one.

  • @janardhanreddy3267 · 1 day ago

    The interview series is good. Please upload the remaining 10 questions; eagerly waiting, sir.

  • @user-oi5pw9ly7r · 1 day ago

    In the master-slave architecture, the driver node acts as the master and multiple worker nodes run the submitted Spark jobs. The SparkContext is the entry point to the Spark cluster: the driver creates the jobs and the execution plan, requests resources from the resource manager, which allocates them back to the driver, and the driver also iterates over the jobs.
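    A minimal PySpark sketch of that flow (illustrative only; the app name and data are made up, not from the video):

        from pyspark.sql import SparkSession

        # The SparkSession (wrapping the SparkContext) lives on the driver
        # and is the entry point to the cluster.
        spark = SparkSession.builder.appName("driver-demo").getOrCreate()

        df = spark.range(1_000_000)  # transformation: only a lazy plan so far

        # Action: the driver turns the plan into jobs and tasks, asks the
        # resource manager for executors, and the executors run the tasks.
        print(df.count())

        spark.stop()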

  • @VenkatBala-jv2yh · 2 days ago

    Great questions....

  • @Sudeep-ow4pe · 2 days ago

    The interview series is really helpful, thank you!

  • @rishiraj2014 · 3 days ago

    One correction here at 9:59: in a narrow transformation there is no shuffling, and in a wide transformation there is shuffling of data.
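    A minimal PySpark sketch of the difference (illustrative only; the column names are made up):

        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F

        spark = SparkSession.builder.appName("narrow-vs-wide").getOrCreate()

        df = spark.createDataFrame([("a", 1), ("b", 2), ("a", 3)], ["key", "value"])

        # Narrow transformation: each output partition depends on a single
        # input partition, so no shuffle occurs.
        filtered = df.filter(F.col("value") > 1)

        # Wide transformation: rows with the same key must meet in one
        # partition, which forces a shuffle of data across the network.
        totals = df.groupBy("key").agg(F.sum("value").alias("total"))

        totals.show()
        spark.stop()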

  • @sashikiran9 · 3 days ago

    Good content!

  • @TanyaSingh-yb5hl · 3 days ago

    Please share the link to these 175 LeetCode questions; it would be very helpful.

  • @niridha23 · 3 days ago

    Thanks for conducting these mock interviews, Sumit sir. They are really helpful 😊

  • @Nnirvana · 3 days ago

    Parquet is a columnar file format that stores metadata along with the original data, i.e. the MIN/MAX values of the different columns in that file. During a read operation it checks this metadata and avoids scanning the parts of the file that are irrelevant. By default it also comes with Snappy compression, which saves a good amount of storage space.
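    A small PySpark sketch of this behaviour (the path, codec option, and filter are made up for illustration):

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.appName("parquet-demo").getOrCreate()

        df = spark.range(0, 1_000_000).withColumnRenamed("id", "order_id")

        # Snappy is the default codec; per-column min/max statistics are
        # written into the Parquet footer alongside the data.
        df.write.mode("overwrite").option("compression", "snappy").parquet("/tmp/orders")

        # On read, the filter is pushed down: row groups whose min/max
        # statistics cannot match order_id > 999000 are skipped, so the
        # whole file is not scanned.
        hits = spark.read.parquet("/tmp/orders").filter("order_id > 999000")
        print(hits.count())

        spark.stop()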

  • @SrikarPalivela · 4 days ago

    🎯 Key points for quick navigation:

    00:24 🎓 Anur's background and expertise
      - Anur has 9 years of industry experience, with over 7 years in Big Data.
      - His primary skills revolve around Spark, AWS, cloud, Databricks, Kafka, and Airflow.
      - He is currently working as an assistant manager at KPMG Global Services.

    02:41 🛠️ Ram's project overview (insurance & green energy domains)
      - Ram's previous project involved data ingestion, cleaning, and transformation from raw to gold-layer tables in a medallion architecture.
      - His current project in the green energy domain includes moving data between systems, managing the data warehouse, and creating a data strategy for the company.
      - He uses a variety of sources for data ingestion, from APIs to third-party sources dumped in ADLS Gen2.

    05:29 🔐 Data security framework implementation
      - Ram implemented an entitlement framework in Databricks for row-level and column-level masking.
      - Encryption methods were used, with keys stored in Azure Key Vault linked to Databricks through a DBUtils secret scope.
      - Utilized Databricks Delta Lake as the file format and ensured data security across the various domains within the project.

    11:44 🔄 Resolving pipeline latency bottlenecks
      - Ram faced a bottleneck due to a processing slowdown as data accumulated in a Databricks pipeline.
      - Resolved it by optimizing cluster configurations, increasing CPU cores, and adjusting shuffle partitions for better performance.
      - Leveraged features like cost-based optimization, join reordering, and adaptive query execution in Databricks for further pipeline optimization.

    16:52 📊 Real-time dashboard pipeline implementation
      - Ram's approach for a real-time dashboard involves capturing data from Oracle as a live stream, storing it in Azure ADLS Gen2, and aggregating it for the dashboard.
      - Explained the use of the CDC feature, the ADF integration runtime, and operations in Databricks for updating and reflecting real-time data changes on the dashboard.
      - Implemented triggers in Azure Data Factory, focusing on storage-based triggers that fire on file arrival for immediate processing.

    21:29 📊 Initial Spark optimization explanation
      - Transformations in Spark are lazy and added to the directed acyclic graph (DAG), which is evaluated only when data is requested.
      - Spark's lazy evaluation leads to optimized action execution, processing only the necessary data.

    22:25 🖥️ Cluster configuration and executor numbers
      - Calculating the ideal number of executors and memory allocation based on data size and CPU cores.
      - Starting from a baseline number of executor cores for optimal cluster performance.

    24:58 💻 Approaching a Spark SQL problem
      - Creating a cumulative revenue calculation from a DataFrame in Spark SQL.
      - Utilizing window functions and data manipulation to achieve the desired output.

    Made with HARPA AI
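    The cumulative-revenue problem mentioned at 24:58 is a classic window-function case; a minimal PySpark sketch (the table and column names are made up):

        from pyspark.sql import SparkSession, Window
        from pyspark.sql import functions as F

        spark = SparkSession.builder.appName("cumulative-revenue").getOrCreate()

        sales = spark.createDataFrame(
            [("2024-01-01", 100.0), ("2024-01-02", 250.0), ("2024-01-03", 80.0)],
            ["order_date", "revenue"],
        )

        # Running total: sum the revenue over all rows up to the current one.
        w = Window.orderBy("order_date").rowsBetween(Window.unboundedPreceding, Window.currentRow)
        sales.withColumn("cumulative_revenue", F.sum("revenue").over(w)).show()

        spark.stop()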

  • @naveenkumarsingh3829 · 4 days ago

    Bro, he is cheating on the reverse-list question, looking at notes on the side and writing the answer. Wow.

  • @rabeeahmohammedyaqeen3956 · 5 days ago

    Are you doing this on the command prompt, or what?

  • @singhjirajeev · 6 days ago

    INSERT INTO emp2 SELECT * FROM emp1;

  • @bharanidharanm2653 · 7 days ago

    The 3rd scenario is not clear. Are we updating any configuration setting to avoid the small files problem?

  • @RohitSharma-ug8rv · 8 days ago

    What is cardinality?

  • @krisharjunakinjarapu3071 · 5 days ago

    Cardinality tells the number of distinct values in a column relative to the number of rows.
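    A quick runnable illustration (the table is made up): compare a column's distinct count with the row count.

        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F

        spark = SparkSession.builder.appName("cardinality-demo").getOrCreate()

        users = spark.createDataFrame(
            [(1, "IN"), (2, "US"), (3, "IN"), (4, "IN")], ["id", "country"]
        )

        # "country" has low cardinality (2 distinct values across 4 rows),
        # while "id" has high cardinality (every row is distinct).
        users.select(
            F.countDistinct("country").alias("distinct_countries"),
            F.count("*").alias("total_rows"),
        ).show()

        spark.stop()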

  • @souravdas-kt7gg · 8 days ago

    My solution:

    WITH c AS (
        SELECT e.*, m.salary AS manager_salary, m.name AS manager_name
        FROM employees_prac e
        LEFT JOIN employees_prac m ON e.managerid = m.id
        WHERE e.salary - m.salary > 0
    )
    SELECT name FROM c;

  • @talknow2859 · 8 days ago

    Very helpful 🎉🎉🎉

  • @jhonsen9842 · 8 days ago

    LOL, why is this question relevant?

  • @priyatamnayak2208 · 9 days ago

    Dear sir, I am facing an issue: when I run mysql-ctl cli; it shows "command not found"... Kindly help.

  • @vanshagarwal1355 · 4 days ago

    Same issue.

  • @AnandPatil-eu1tl · 10 days ago

    Thank you sir, these videos are very helpful.

  • @shobhittiwari2014 · 11 days ago

    Sir, when is the next video coming?

  • @YashKarambalkar-og3sy · 12 days ago

    Best video for starters <3

  • @user-bj3mh3nm1n · 12 days ago

    Will you be teaching it yourself, or will someone else?

  • @AniketPatil-yr1iw · 12 days ago

    Hi Sumit sir, in this 5th video the description is missing the topic list. Can you please add it?

  • @mayanksatija684 · 12 days ago

    In my view, we can do the second question as below:

    WITH t1 AS (
        SELECT customer_number, COUNT(*) AS count
        FROM orders
        GROUP BY customer_number
    )
    SELECT t1.customer_number
    FROM t1
    WHERE t1.count = (SELECT MAX(count) FROM t1);

  • @kmthailu2262 · 13 days ago

    Thank you! This is very helpful

  • @mahavirsinghrajpurohit8004 · 13 days ago

    ORDER BY and DISTINCT will work together if the ORDER BY column is also included in the SELECT list.
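    A small runnable check of this (using SQLite here just for convenience; the table and data are made up, and the same principle applies in MySQL):

        import sqlite3

        con = sqlite3.connect(":memory:")
        con.execute("CREATE TABLE emp (name TEXT, city TEXT)")
        con.executemany(
            "INSERT INTO emp VALUES (?, ?)",
            [("a", "Pune"), ("b", "Delhi"), ("c", "Pune")],
        )

        # DISTINCT and ORDER BY combine cleanly because the ORDER BY
        # column also appears in the SELECT list.
        for row in con.execute("SELECT DISTINCT city FROM emp ORDER BY city"):
            print(row)  # ('Delhi',) then ('Pune',)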

  • @ravulapallivenkatagurnadha9605 · 13 days ago

    Nice videos

  • @siddharthbarthwal630 · 14 days ago

    Very nice way to teach. Thank you, Sir.

  • @AnandKumar-wq3vo · 14 days ago

    WITH cte AS (
        SELECT *,
               ROW_NUMBER() OVER (ORDER BY start_time) AS rownum,
               DATEADD(MINUTE, -1 * ROW_NUMBER() OVER (ORDER BY start_time), start_time) AS updated_time
        FROM service_status
        WHERE status = 'down'
    )
    SELECT service_name,
           MIN(start_time) AS start_updated_time,
           MAX(start_time) AS end_updated_time,
           status
    FROM cte
    GROUP BY service_name, updated_time, status
    HAVING COUNT(*) > 3;

  • @bharathKumar-or6gd · 14 days ago

    Clear and great explanation of the WHERE and HAVING clauses 👌👌👌👌

  • @rahulpandit9082 · 15 days ago

    Both of them are like that... The interviewer is feeling sleepy, and the candidate wants to take advantage of it 😂

  • @user-oy9cc8dv8i · 15 days ago

    If possible, also mention the experience level these interviews are targeting (e.g., fresher, 1 year, or 3 years of experience).

  • @prakritigupta3477 · 16 days ago

    My solution goes like this:

    WITH cte AS (
        (SELECT attacker_king AS king, region, SUM(attacker_outcome) AS battles_won
         FROM battle
         GROUP BY attacker_king, region
         ORDER BY attacker_king ASC)
        UNION ALL
        (SELECT defender_king AS king, region,
                SUM(CASE WHEN attacker_outcome = 0 THEN 1 ELSE 0 END) AS battles_won
         FROM battle
         GROUP BY defender_king, region
         ORDER BY defender_king ASC)
    ),
    cte1 AS (
        SELECT w.region, e.house, SUM(w.battles_won) AS total_wins,
               DENSE_RANK() OVER (PARTITION BY region ORDER BY SUM(w.battles_won) DESC) AS rn
        FROM cte AS w
        INNER JOIN king AS e ON e.k_no = w.king
        GROUP BY w.region, e.house, w.battles_won
    )
    SELECT region, house, total_wins
    FROM cte1
    WHERE rn = 1;

  • @daamanrajput2175 · 17 days ago

    Hi Sir, please let us know when the next video will be uploaded.

  • @PujaPrasad-qu1yi · 17 days ago

    Hello @sumit sir, I tried a while loop for the list and it kept running for more than 20 min. Can I get to know why?

    order_amount = [100, 200, None, "invalid", 300, 400.5]
    i = 0
    sum = 0
    while i < len(order_amount):
        if type(order_amount[i]) == int or type(order_amount[i]) == float:
            sum = sum + order_amount[i]
        else:
            i = i + 1
            continue
        i = i + 1
    print(sum)
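    A plausible cause (an assumption, since the comment's original indentation was lost in transit): if the continue sits at the loop level instead of inside the else, i never advances for numeric elements and the loop never ends. A sketch that avoids the manual index bookkeeping entirely:

        order_amount = [100, 200, None, "invalid", 300, 400.5]

        # Iterate over the values directly; isinstance handles both int and
        # float, and there is no index counter to forget to increment.
        total = 0
        for amount in order_amount:
            if isinstance(amount, (int, float)):
                total += amount

        print(total)  # 1000.5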

  • @rithvikreddy6839 · 17 days ago

    Sir, please upload 4 more videos as soon as possible… we are eagerly waiting.

  • @rithvikreddy6839 · 17 days ago

    Sir, please upload 5 more videos as soon as possible… we are waiting for the content.

  • @sampaulson0009 · 17 days ago

    Really, sir, you clear all the doubts. Because I am from a non-IT background, your way of teaching is very easy and helpful.

  • @jameskhan6972 · 18 days ago

    I think repartitioning happens at the executor level. Executors perform the actual data movement and redistribution: they read the data from the existing partitions, shuffle it across the network, and write it into new partitions as specified by the repartitioning logic.
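    A minimal sketch of this (illustrative only): repartition() is planned on the driver, but the shuffle read and write it triggers run on the executors.

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.appName("repartition-demo").getOrCreate()

        df = spark.range(0, 1_000_000)
        print(df.rdd.getNumPartitions())   # initial partition count

        # The executors read the existing partitions, shuffle the rows
        # across the network, and write them into 8 new partitions.
        df2 = df.repartition(8)
        print(df2.rdd.getNumPartitions())  # 8

        spark.stop()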

  • @kch8278 · 7 days ago

    I agree with you. Repartitioning happens on the executors.

  • @electricalsir · 18 days ago

    How many sneakers do you have? Nice collection, bro.

  • @electricalsir · 18 days ago

    How many sneakers do you have?

  • @Aziz-oi1qt · 18 days ago

    Nice explanation.