> This won't be made available to anyone and everyone, but we do believe that responsible SMEs and midmarket companies also need access to these tools in order to identify key vulnerabilities in their systems; not just enterprises.
So this is the same policy that Anthropic and OpenAI have; it is just based on your criteria rather than theirs.
cortesoft
Relevant: https://news.ycombinator.com/item?id=48016224
What's the difference between this and running Shannon on AWS Bedrock, fully airgapped in my VPC? I've gotten some pretty great results with Shannon [no subprocessor, and I can pay via AWS credits]. Even better when using a Claude Code token [effectively free with our $200/mo CC subscription].
I tried Kimi, but it generally spins its wheels extensively in its thinking tokens. kimi2.7 is an attempt at reducing this. But doing fine-tuning means you will always be behind the latest.
As a side note - I think it's very unprofessional and very shitty to not mention kimi2.6 at all in your marketing copy. And I feel that you posted that in this HN post begrudgingly, since the HN crowd would have flagged that.
Confirmed with a Google search too: https://www.google.com/search?q=kimi+site%3Aargusred.com
All around your marketing website you keep mentioning 'A model lab built it'. A fine-tune does not maketh you a model lab - some humility please :)
Finally - doesn't Kimi's licensing require you to mention them? Didn't Cursor run into the same issue?
luminati
IMO the most interesting thing about this is that Kimi K2.6, an extremely capable model, can be relatively easily post-trained to allow pen tests.
This in its own right proves that the defenses of Fable and others are temporary blocks, and AI-based hacking is going to be effectively available to all parties regardless of stopgaps, as long as open models exist.
jjcm
Fantastic. Could you share more details on what it was like post-training a model?
andai
Any generic abliterated or uncensored open-weight model (such as a Qwen variant) will happily comply with requests like this.
skiing_crawling
Show HN: We told Claude to generate a marketing page for a theoretical pentesting model
jrflowers
What was your approach to benchmarking an adversarial agent?
This is an open problem that I came across (in a different domain), as the search space can be really wide. It's hard to measure results for non-trivial tasks.
Would be really interested if you can share your eval approach :)
mkaszkowiak
Why create an offensive tool rather than a repo-scanning tool?
I can't think of any way to safely release an offensive tool publicly.