DeepSeek-R1 leverages a pure RL approach, enabling it to autonomously develop chain-of-thought (CoT) reasoning, self-verification, and reflection—capabilities critical for solving complex problems.
I’d like to see them hire some formal methods people to at least formally verify crucial parts of it.
It might actually also be good to analyze it with an LLM to identify any hidden problem areas.
I’m interested to hear why my idea is probably foolish as well, though.
A great deal of work is going into this area. In fact, I believe there are quite a few parties using LLMs to look for security bugs, and the US Department of Defense had a multimillion-dollar competition to motivate just that.
LLMs have no abstract reasoning, so while they can write an okay-sounding bug report, often it’s wrong meow.
I do think the Linux Foundation hires security people, and the big contributors almost certainly do.
Doesn’t the newly released Chinese model actually do abstract reasoning?
To my untrained self, that sounds like reasoning.
With chain of thought, it basically asks itself to generate related sub-questions and then answers those sub-questions.
Basically it’s the same thing, but recursive. So, just as it looks like it can tell you things, it also looks like it’s reasoning.
Now, it may well be an improvement, but it’s still basically “I have this word, what is statistically most likely to be the next word?” over and over again.
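To make the "most likely next word, over and over" loop concrete, here is a toy sketch. It is only an illustration under big assumptions: a real LLM is a neural network working on tokens with a long context window, not a table of word-pair counts, and the corpus and function names below (`train_bigrams`, `generate`) are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, start, max_words=8):
    """Repeatedly append the statistically most likely next word."""
    out = [start]
    for _ in range(max_words - 1):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # this word was never followed by anything in the corpus
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)


# Hypothetical toy corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat sat on the sofa"
model = train_bigrams(corpus)
print(generate(model, "the", max_words=5))  # prints "the cat sat on the"
```

The output looks like a sentence, but nothing in the loop checks whether it is true or makes sense; it only follows the counts. Whether scaling this idea up (and adding chain of thought on top) amounts to reasoning is exactly what's being argued about here.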
Thanks.
Edit: Not sure who’s downvoting me for asking reasonable questions.