

Do we really need to summarize 36 words down to 27 though?


Do we really need to summarize 36 words down to 27 though?


I am sorry but do goat actually float?


It is like the machine in industrial revolution stayed and flourished, yet the child labor and unfair labor practice are being fight against.
Clearly, this is not a justification for current LLM system or those feeding it. I feel it is important to keep these companies accountable for the crime they have commited.
In the long run, I feel it is also important to think about what do we need to do to keep LLM working for humanity, and organize to make that a reality. If that requires complete removal of LLM as a technology, so be it; but I am not entirely convinced we even need to go that far.


That is the thing about formal proof: if the definition is correct, which usually is relatively short and should be written by human, there is almost no chance of the prove being wrong. The only exception are when the LLM exploits a bug in the proof assistant kernel, and these kernel are usually designed to be exceptionally small, thus making bugs unlikely.
That being said, opus 4.6 found a bug that eventually lead to the proof of false (opus is unable to produce the proof of false, hence unlikely to exploit it): https://github.com/rocq-prover/rocq/issues/21682
However, like I said, the code quality of the llm is usually not on par with an expert, and they have a tendency to produce unnecessary lemmas and complications that will need to be cleaned up by human.
Also, we have a very detailed pen and paper proof, which are designed to be easily translatable to proof assistants. We have also setup all the lemma and theorems to reach the end goal. All of these are done by humans, without these, I don’t believe any LLM can make much progress on this project.
If you just want to avoid U.S. company, you can try mistrialAI.


Yeah LaTeX is a bit different, think about describing a process in MS Word, v.s. writing a program that performs such process on a computer: LaTeX is more like description aimed for human consumption, where as formal proof is more like a program that computers can rigorously execute.
Proof assistant have only attracted the attension of mathematicians very recently, thanks to the organization surrounding MathLib in Lean, and the promption of Terance Tao.
It also rides the AI train quite a bit, as AI have a tendency to confidently be wrong, having a computer to check its proof can be very useful.


There are proof assistants https://en.wikipedia.org/wiki/Proof_assistant that would encode a mathematical proof as code, and verify its correctness for you.
Writing completely formal proof is very painstaking, because it means we will need to flash out a lot and a lot of details (which are mostly trivial for experts) for computers to accept it, and we also need to know how to work with proof assistants.
Human proofs often ignore these details to make it readable, yet also make it more prone to mistakes. Whereas formalized prove in proof assistant can very rarely be wrong (unless there is an unlikely bug in the assistant kernel), but mostly unreadable (unless the proof is incredibly elegant).
So in general, translating good human proof to computer proof requires more expert labor than huge conceptual innovation, yet it usually require the steep learning curve of understanding the ins and outs of a proof assistant, which can take years of experience.
LLM used to be pretty bad at this because even filling in trivial details can quickly derail them. Recently a few flagship coding model are finally able to do this, albeit with a large amount of token consumption in thinking.


Many academics around me have a paid plan of LLM of some sort, most are on $100 plan some are on $200, all of them are getting reimbursed for their plan.
Most of them uses it to optimize code, generate visualization, or formalize pen and paper proof.
I hated it, and don’t use much of it myself. But it seems too useful for these people and it is hard to stop them. As an example, formalizing a pen and paper proof can take an expert weeks, if not month of work, whereas it only takes codex a week.
But I do feel this success is tied to the nature and value of academia, and might not transfer to other fields or industrial projects:


I am not disagreeing or attempting to downplay that academic publishing is still bad in many fields. But there are fields that are now out of the dumper fire, so I sm hopeful that other fields can learn from these and escape.
I also want to highlight the solution that worked is organization, public funding, and academic governance. So if you are unhappy about the situation in the field, maybe it is a good time to organize all your unhappy colleagues and build something new and better :)
Especially 4 of them are just differen peppers.


Not the case with ACM and dagstuhl, but I know other publishers like Springer and IEEE does this. That is why we moved away from them :)


I don’t think that is the case for ACM and Dagstuhl. ACM used to have this ACM open system where department pay a fixed amount subscription per year depends on the department size.
Now that all ACM paper is open access, I don’t know if they are still doing that. Dagstulh never had these, as far as I know, hosting articles are extremely cheap.
These is certainly not the norm everywhere, but our field have already navigated out the swamp of free access, I hope more fields wil.


Yeah, bio and medician had it quite bad…


Yes, there are many field that are still struggling, but nowadays most of, if not all, the articles in my domain is published by ACM and Schloss Dagstuhl, both are academic governed non-profit that are full open access (I don’t think author even have the option to close access.
That being said, fields like medicine, biology, engineering is very much behind. I am very glad my field moved away from publishing with IEEE. They are not necessarily “behind” the entire academia, but certainly way behind my field.


academic publishing. It used to be monopolized by a couple publishing company with unreasonably high fee for access on both the side of researcher and reader.
Now, through hard works of the academics and funding from the public, now many publishing company are non-profit governed by working academics. And in many fields, open access has become the default.


To be completely frank, that is a very disrespectful thing to say. My students are intelligent and capable, and most importantly they are very respectful of other’s time and effort, both their teacher and their peer by choosing to understand the material instead of cheating. They attended school to improve themselves by understanding and absorbing existing human knowledge, not because they are ignorant.


Yes he ordered a Trump phone in order to expose it.


I am confused many people feel filing tax is a useful skill, I have never once used itemized deduction.


I am quite surprised that in the school I am teaching in, student have a much more negative attitude towards genAI than professors, especially in the context of education.
30% of the professor feel that genAI can play a role in education, whereas only 11% of the student holds the same view. That seems to reflect quite well in my homework, only a very small minority (10%) uses some extend of genAI in writing open-ended writing homeworks.
happy cake day!