GLM-4.7 Is Getting Stronger

Hi everyone, happy weekend.

It is near year end and I wanted to take it easy, but this week had many model updates in China. As someone who focuses on evaluations, I felt guilty for not covering GLM-4.7 yet.

I have been writing code with GLM-4.7 for a few days already.

I also saw on X that GLM-4.7 is ranked first among domestic open source models on Artificial Analysis, so I wanted to share a few thoughts.

Artificial Analysis

First, what is Artificial Analysis.

Website: https://artificialanalysis.ai/

It is an independent evaluation group. What I like is how comprehensive the benchmarks are. They cover GPQA, HLE, AIME 2025, LiveCode Bench, terminal-bench, and more across reasoning, knowledge, math, code, and tool use. They also include pricing, speed, and latency, which are important but easy to overlook.

Overall, it is a solid reference.

I looked through the sub rankings and noticed a few highlights.

The first is tool use.

A key test for an agent is whether it knows when to call tools in multi step tasks. Tau^2 Bench Telecom measures multi turn dialogue and tool use in telecom scenarios.

GLM-4.7 is now top tier on tool use.

For coding, GLM-4.7 is also strong.

It is just below Gemini, and ranks first among open source models. That is a good signal for coding quality.

There is also an interesting benchmark called Vending-Bench.

It simulates running a vending machine business over a full year and measures profit. It is a practical test for long running task stability.

GLM-4.7 is the first open source model to make a profit, and it even earned more than GPT-5.1.

That is enough about benchmarks. Here are my own takeaways from using it.

My takeaways

First, front end aesthetics improved a lot.

It matters. We are visual. Chinese models have lagged here, but GLM-4.7 is a big jump.

For example, I asked it to build a page about a 5090 GPU. The prompt was simple: “Generate a page about 5090.” It chose the colors on its own, and the palette is close to Nvidia.

Here is another common infographic scenario. GLM-4.7 handled both content and design well. The prompt came from @shaomeng and the content was from the official GLM-4.7 blog.

It even rendered the original image and the table.

I also tried a classic bouncing heart animation that was popular two years ago.

One shot, great result.

The official team also shared a case that combines gestures and cards.

Second, Skills support.

This is important. With Skills, you can extend functionality a lot. For example, I used an image generation skill so GLM can do image tasks it could not do before.

I also used a skill to fetch basic information from a YouTube video.

The results are accurate. With Skills support, there is another reason to consider replacing Claude.

Third, code review and bug finding improved.

I used GLM-4.7 not only for coding, but also for debugging and analysis.

With chrome-devtools MCP, it did very well.

Closing thoughts

Looking back at the end of 2025, the feeling is strong. Last year we were still wondering when we could catch up to Claude 3.5, and we worried about bugs from local models.

This year, from DeepSeek in the first half to GLM in the second, we were not just observers. We were part of the wave.

Look at the overseas community comments. They are genuine praise.

I have to say it, this is impressive.

We used to look across the ocean at a sky that felt out of reach. Now we are part of that sky for others.

A father supports a child learning to ride a bike. The child keeps looking back, afraid the father will let go. The father says, do not look back, look forward, then you can ride straight. The key is not who holds you, but how far you can go.

We are at a similar moment. In the past, we assumed the strongest models belonged to others. Now GLM-4.7 writes its own name near the top of the leaderboards.

We can finally ask what future we want to build with our own models.

That is the biggest gift of 2025: a shift in mindset toward independence.

As 2025 ends, the days of only looking at the stars are over. GLM-4.7 proves a simple point: if you keep moving forward, you become the sky.

Thank you for reading.

If this helped, please like, share, and follow. Do not miss updates and consider starring the account.