I have been defaulting to GitHub Copilot for reviewing PRs. After all, GitHub runs it automatically and it doesn’t seem to cost anything.
But in the last few reviews, I started to doubt what it actually “understands.”
For example, a very basic issue: it still considers 1.82.0 to be higher than 1.91.1, a version comparison mistake that was common in early large models.
Even if that part is a model problem, it also thinks that Rust 1.91.1 has not been released yet, which shows that the agent’s retrieval and its awareness of current state are lacking as well.
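To make the version point concrete, here is a minimal sketch of comparing dotted versions component by component. The helper names `parse_version` and `is_newer` are my own for illustration, and the sketch assumes plain numeric versions with no pre-release or build suffixes.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted version such as "1.91.1" into integer components."""
    return tuple(int(part) for part in version.split("."))


def is_newer(candidate: str, baseline: str) -> bool:
    """Return True when candidate is a strictly higher version than baseline."""
    # Tuples compare element by element, so 91 vs 82 is decided numerically.
    return parse_version(candidate) > parse_version(baseline)


print(is_newer("1.91.1", "1.82.0"))  # True: 91 > 82 in the minor component
print("1.9.0" > "1.10.0")            # True: plain string comparison gets this wrong
```

The second line shows a typical way to get this wrong: comparing versions as text instead of as numbers.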
A bigger issue: Copilot’s review clearly works file by file.
It’s okay for checking code style and boundary conditions, but it lacks a global perspective. For instance, in one PR, the agent miscalculated relative paths and copied the same file to multiple locations, when in fact only one copy takes effect. Copilot completely failed to notice this, and it didn’t even consider what the original issue associated with the PR was asking for.
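A cross-file check for the duplicated-file case does not have to be complicated. Below is a rough sketch, not how Copilot or any other reviewer actually works: it assumes the reviewer already has the PR’s added files as a mapping from path to content, and the function name `find_duplicate_files` and the example paths are hypothetical.

```python
import hashlib
from collections import defaultdict


def find_duplicate_files(files: dict[str, bytes]) -> list[list[str]]:
    """Group paths whose contents are byte-for-byte identical."""
    by_digest: dict[str, list[str]] = defaultdict(list)
    for path, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(path)
    # Only digests shared by more than one path indicate duplicated files.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


# Hypothetical PR contents: the same config copied to two locations.
added = {
    "config/app.toml": b"name = 'demo'\n",
    "src/config/app.toml": b"name = 'demo'\n",
    "README.md": b"# demo\n",
}
print(find_duplicate_files(added))
```

A check like this only flags identical copies. Deciding which copy actually takes effect still needs knowledge of the project layout, which is exactly the global view a per-file review lacks.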
In my opinion, a qualified code reviewer agent should primarily evaluate from a global perspective:
Does the PR satisfy the issue, align with the project goals, and have a reasonable file structure and architecture? Only then should syntax and detail issues be considered.
I’m planning to add a reviewer mode to Holon soon. Are you all really using reviewer agents now? What do you usually use?