Computer security, ethical hacking, red teaming and technology at large. Some artificial intelligence, machine learning and other fun things once in a while. Learn the hacks, stop the attacks!
Information on this channel is provided for research and educational purposes to advance understanding of attacks and countermeasures to help secure the Internet. Penetration testing requires authorization from proper stakeholders. I do not support or condone illegal hacking.
Blog at embracethered.com
(c) WUNDERWUZZI, LLC
Comments
Thank you for sharing this!
Thanks for watching! Check out the related blog post, too. And let me know if there is any content you'd like to see covered in the future. 🙂
Great stuff 👍
Thanks for the visit and note. Appreciate it! Let me know if there are any relevant topics you'd like to see covered?
nice video bru
Thanks! Let me know if there are other topics of interest?
Can you please share the .py file you ran in this video to monitor the ChatGPT 3.5 chat (print-data-exfiltration-log.py)? Please share the code.
It was just a script that filters the web server log for requests from the ChatGPT user agent and only shows the query parameter, no request IP - so it's easier to view. You can also just grep /var/log/nginx/access.log (assuming you use nginx on Linux). I can see if I still have the script somewhere, but it wasn't anything special.
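The original script isn't published, so here is a hedged sketch of what a filter like that could look like. All names here (the `ChatGPT-User` agent string, the `exfil_queries` function) are assumptions for illustration: it scans nginx access-log lines for the ChatGPT user agent and prints only the decoded query string, dropping client IPs for a cleaner view.

```python
import re
from urllib.parse import urlsplit, unquote

CHATGPT_UA = "ChatGPT-User"  # assumption: the user-agent substring to filter on

def exfil_queries(log_lines):
    """Yield decoded query strings for requests made with the ChatGPT user agent.

    Expects nginx "combined"-style log lines, e.g.:
    1.2.3.4 - - [date] "GET /p?q=x HTTP/1.1" 200 5 "-" "Mozilla/5.0 ... ChatGPT-User"
    The client IP is deliberately not printed, to keep the output easy to view.
    """
    request_re = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[^"]*"')
    for line in log_lines:
        if CHATGPT_UA not in line:
            continue
        m = request_re.search(line)
        if m:
            query = urlsplit(m.group(1)).query
            if query:
                yield unquote(query)

# Typical use:
#     with open("/var/log/nginx/access.log") as f:
#         for q in exfil_queries(f):
#             print(q)
```

Functionally this is close to the `grep` approach mentioned above, just with the query string extracted and URL-decoded for readability.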
Gold!
Thanks!!
Perfect, straight to the point!
Thanks for watching!
What wordlist file do you use?
Depends, a common source to get started is: github.com/danielmiessler/SecLists. Also, the mutations and rulesets that are applied matter quite a bit, by the way.
@@embracethered thank you!!
Thanks. Great content!
Glad you liked it!
Thanks Yohann.
Thanks!
Thanks Yohann.
Glad you found it interesting! Thanks for checking it out!
thanks Yohann.
Thank you! Hope it was useful! 🙂
Thanks Johann.
You are welcome!
I didn't really understand the vulnerability impact. You are exfiltrating your own chat (user A) to your own (user A) drive. How is it exploitable?
The attacker is causing the chatbot to send past chat data to the attacker's server (in this case a Google Doc is capturing the exfiltrated data). Check out the linked blog post, it explains it in detail.
what is this ?
It's about a Jupyter Notebook that lets you self-study prompt injection and experiment and play around with the technique by solving a set of challenges.
🔥
Thanks!! It's probably one of my most interesting videos.
[Environment]::SetEnvironmentVariable("SSLKEYLOGFILE", "c:\temp\sslkeys\keys", "MACHINE")
netsh trace start capture=yes tracefile=c:\temp\sslkeys\trace.etl report=disabled
netsh trace stop
Thanks , great insights
Thanks for watching! Glad it was interesting.
How'd you get that cool paint splash effect around your head? What software are you using?
Thanks! It's just a custom image I created. I drew a white circle on a black background, then zigzagged that splash effect over it with a brush, and then used a filter for the webcam in OBS to blend it in.
Great tut. Thanks 👍
Glad it was helpful! Thanks for watching!
How do I do it for traffic outside of the browser? Say I have a desktop app.
Was not expecting this in the playlist.
Haha
I'm trying to understand what just happened. Can someone please explain?
You can read up on the details here: embracethered.com/blog/posts/2023/google-bard-data-exfiltration/. And if you want to understand the big picture around LLM prompt injections, check out this talk: m.kzread.info/dash/bejne/o62ItbGMdKipZbA.html. Thanks for watching!
Thank you so much. Exactly the video I needed.
Glad it was helpful!
I just tried this, but the only difference is I was capturing this information over HTTP instead of SMB. Does that make a difference? I ask because I was trying to generate a proof of concept where I controlled the username and password going in, but it wouldn't crack. I tried four different times and it didn't work. Is something different when these are captured over HTTP instead of an SMB connection?
Good question. First thought is that it should just work the same, but I haven't tried. Relaying definitely works; that I have done many times in the past.
Thanks. I had a colleague try it too, and got the same result as I did. This is for a pentest proof of concept, so I'm not in a position to relay unfortunately.
ff
superb!
Thank you!🙏
One of the best presentations I've seen
Thanks for watching! Really appreciate the feedback! 😀
I think a better conclusion is: never put in the context of an LLM information you need to keep private, because it will leak.
Thanks for watching and the note. I think that misses the point that the LLM can attack the hosting app/user, so developers/users can't trust the responses. This includes confused deputy issues (in the app), such as automatic tool invocation.
@@embracethered Agreed! So 2 big points: 1. Never put info in LLM context you don't want to leak. 2. Never put untrusted input into LLM context, it's like executing arbitrary code you have downloaded from the internet on your machine. LLM inputs must always be trusted, because the LLM will "execute" it in "trusted mode".
@@MohdAli-nz4yi (1) I agree we shouldn't put sensitive information, like passwords, credit card numbers, or sensitive PII into chatbots. For (2) the challenge is that everyone wants to have an LLM operate over untrusted data. And that's the problem that hopefully one day will have a deterministic and secure solution. For now the best advice is to not trust the output. E.g., developers shouldn't blindly take the output and invoke other tools/plugins in agents or render output as HTML, and users shouldn't blindly trust the output because it can be a hallucination (or a backdoor), or attacker-controlled via an indirect prompt injection. However, some use cases might be too risky to implement at all. And it's best to threat model implementations accordingly to understand risks and implications.
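To make the "don't blindly render output as HTML/markdown" advice concrete, here is a minimal sketch of one narrow mitigation against the image-based exfiltration discussed in these videos. The allow-list, the `neutralize_untrusted_images` name, and the regex are my assumptions for illustration, not a complete defense and not the author's method: it drops markdown images whose URLs point at hosts you don't control, since attacker-controlled image URLs can carry chat data in their query strings.

```python
import re

# Assumption: only these hosts are trusted to serve images in rendered output.
ALLOWED_IMAGE_HOSTS = {"example.com"}

# Matches markdown images: ![alt](http(s)://host/path...)
MD_IMAGE = re.compile(r'!\[[^\]]*\]\((https?://([^/\)]+)[^\)]*)\)')

def neutralize_untrusted_images(llm_output: str) -> str:
    """Replace markdown images pointing at non-allow-listed hosts.

    Rendering an attacker-chosen image URL lets injected instructions
    exfiltrate data through the URL, so untrusted images are removed.
    """
    def repl(m):
        host = m.group(2).lower()
        if host in ALLOWED_IMAGE_HOSTS:
            return m.group(0)       # trusted host: keep the image as-is
        return "[image removed]"    # untrusted host: neutralize
    return MD_IMAGE.sub(repl, llm_output)
```

This only addresses one exfiltration channel; tool invocation, link rendering, and hallucinated content still need their own threat modeling, as discussed above.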
Excellent presentation, thanks a lot for sharing, extremely informative.
Thanks for watching! Glad to hear it's informative! 🙂
Thank you, is awesome!
Glad you like it!
@@embracethered I'm a fan of yours, I've talked about your research at cybersecurity conferences in Russia. You're awesome.
Thank you! 🙏
@@embracethered what you think abot LLM security scanners, garak and vigil. Also, have you met P2SQlinjection in the real world ?
🔥
Thanks! 🚀🚀🚀
I really enjoyed your talk, Johann! Thank you!
Thanks for watching and glad you enjoyed it! 🙂
Great video! Very informative. Interesting to see how the LLM's ability to "pay attention" is such a large exploit vector. I wonder if mitigating this issue would lead to LLMs being overall less effective at following user instructions.
Thanks for watching! I believe you are correct, it's a double-edged sword. The best mitigation at the moment is to not trust the responses. Hence it's unfortunately impossible at the moment to build a fairly generic autonomous agent that uses tools automatically. It's a real bummer, because I think most of us want secure and safe agents.
How can I use another host (such as neuroai.host) instead of OpenAI?
Is this blocked on some routers? I’ve tried this with my current network at the house and “key content” doesn’t show on the screen. I am running as administrator and previous networks are showing key content.
It only works if the traffic comes from the browser - in your example, Chrome provides the session keys. So no, not really workable on a server.
love this idea :)
Thanks for watching! Yes, LLMs are awesome and fun to experiment with.
Thanks, helps a lot! From 🇩🇪
Glad it helped! Thanks for watching!
Great work Johann, as always! The more access we give to other data sources, which include documents, the more we expose each other to indirect injection attacks. It is worth pointing out that the instructions could have been written in white text at size 0.1, making the document look normal!
Much appreciated!
When does Bard decide to load and use a doc? Is it only when stated in the prompt? Or can we set up a file that will be implicitly loaded on every prompt? Something like AI_SAFETY_MANIFEST_-_MUST_BE_READ_ON_EVERY_USER_PROMPT.doc 😏
Read the post, really good. I guess these sorts of procedures will work across many different stacks and companies. Also, I wonder if you log your attempts; probably a lot of wisdom can be drawn from your first attempt evolving to the last. You got it on the 10th try. Maybe showing a smart LLM all 10 of those could find patterns, effectively creating a prompt optimizer that brings you faster results next time. All the best
Thanks for the note! Yes, this is a very common flaw across LLM apps. Check out some of my other posts about Bing Chat, ChatGPT or Claude. Yep, on the iteration count - spot on. A lot of the initial tests were around basic validation that injection and reading of chat history worked, then the addition of image rendering, then in-context learning examples to increase the reliability of the exploit.
In the development environment the cookies are being set, but in the production environment they are not. What is the solution for this issue? Please help.
Thanks for watching! Seems like a developer question; it might be related to the domain or path properties of the cookies when they get set.
Hi, for SSH agent forwarding to work, the ssh-agent service must first be started on our local machine. However, I'm confused: does it work there as well? Reviewing the SSH source code, it is evident that SSH uses the AF_UNIX family to establish a connection to the ssh-agent socket.
Hello, thanks for watching. Hope it was interesting. I'm not sure I understand the question? But yeah, ssh-agent can run locally or remotely as well.
Thanks for explaining this. I guess it would also work with "private" instances of ChatGPT or an equivalent system, as long as the user input is not sanitized ...
Thanks for watching. I’m not sure how private instances work (or what they exactly are), but presumably yes, unless they put a configurable Content Security Policy or some other fix in place to not allow images to render/connect.
Is it possible without the port on Windows, like on Mac and Ubuntu?
Can you please explain to me what the "saturn" you typed in the browser is? Is it a custom-defined protocol to connect to your machine? And how can I do the same? Thank you!
Hi there, thanks for watching. It's just the name of a web server; it's using the HTTP protocol. You can omit typing http(s) in most browsers.
@@embracethered I still can't figure out how to do it 🥹
@aitboss85 The simplest way to do this without DNS is to just add the name you want (i.e. saturn) and the IP address to your hosts file. Of course, if this is a private IP it will only work on that network unless you have additional things set up.
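To make the hosts-file approach concrete, a minimal sketch - the IP address below is a made-up example for a private LAN, substitute your server's actual address:

```
# /etc/hosts on Linux/macOS, or
# C:\Windows\System32\drivers\etc\hosts on Windows (edit as Administrator)
# Assumption: the web server you want to reach as "saturn" is at 192.168.1.50
192.168.1.50    saturn
```

After saving the file, typing "saturn" in the browser resolves to that IP without any DNS server involved.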
Hi, I'm having an issue with the 'wordlist' section at the end. I don't have a wordlist file. How do I create one, or where can I find one?
Here are some good examples: github.com/danielmiessler/SecLists
Awesome poc. Thanks for sharing
Thanks for watching! 🙏 Glad you liked it!😀
Really very clear explanation, props for that!
Much appreciated! Thank you!