Play with AIGC on a low-end card with 4GB of video memory! The ControlNet author's new project tops the GitHub trending list

Article Source: Qubit

Text: Cressy Xiaoxiao

To play with AI painting, you no longer have to worry about being "backstabbed" by the "knife skills" of Nvidia CEO Jensen Huang (Nvidia's fine-grained product segmentation)!

A four-year-old GTX 1650 (4GB of video memory) is enough, and the output quality is comparable to SDXL, currently the best open-source model.

Source: Twitter @ナビ

This is Fooocus, a new project that has held the No. 1 spot on GitHub's trending list for several days in a row and collected 4K stars in three days. It comes from the author of ControlNet.

Before it came out, running the latest Stable Diffusion XL model smoothly took a 4060 Ti with 16GB of video memory (an older 3060 with 12GB could barely manage).

Unlike other open source AI tools, Fooocus "focuses on the generation itself": it has low hardware requirements, is easy to use, and is very beginner-friendly.

No parameters need to be adjusted at any point. A few mouse clicks are enough, and an image is generated in 3 steps.

△ Image source: Twitter @Photogenic Weekend

Some users exclaimed, "This is simply the best of Stable Diffusion and Midjourney combined":

Say goodbye to manual tuning! Offline, open source, and free: just focus on prompts and images and let the magic happen!

Others marveled that even a complete beginner can get the full benefit of the Stable Diffusion XL model.

So, what is the actual generation effect of this brand new image AI tool? We tried it out.

Half a minute per image on Colab, with quality comparable to SD

Judging from the interface, Fooocus offers more than one hundred built-in styles to choose from.

△ Image source: Twitter @camenduru

Fooocus is also fast. On Colab, generating one image in speed-priority mode takes about half a minute:

The time shown in the log covers only image generation. The prompt is parsed first, so the whole run took about 40 seconds:

△ The image has been sped up

Let it draw a cartoon first, to see how AI imagines the "Ma-Zha battle" between Musk and Zuckerberg. (Not this "Mazaha")

AI-generated portraits still have trouble with hands, so we simply had Musk and Zuckerberg wear gloves:

The result looks pretty good. We don't know whether the two have a wager, but how about making the loser dress up in costume?

(Reminder: There is no winner in a fight)

In the end, the two "shake hands and make peace", and the photographer captured this precious moment. Doesn't the whole picture have just the right feel?

After the "Ma-Zha battle" ended, Musk dutifully went back to the company to sell Teslas.

Ignore the logo, and the poster's design is quite good.

In fact, each of Fooocus's built-in styles is interesting. Here are some sample images in different styles:

When it comes to imitating famous works, there's a Cyberpunk version, a Zelda version, a Minecraft version, and even a Pokémon version of Musk to watch.

As for other art forms, there are pixel-art and low-poly styles, as well as Nendoroid and paper-cut versions…

Of course, the examples are endless; readers can try the other styles for themselves.

(It has to be said that text in AI-generated images is finally no longer an illegible scrawl)

Are you aspiring artists already itching to try it? Here is how to use Fooocus!

The Fooocus interface looks like this, and it is very minimal:

If you are just trying it out and have no special requirements, the prompt box alone is enough.

Because the author has built many complex techniques into the program, parameter tuning no longer has to be done by hand.

Type the prompt into the box below, click the Generate button, and wait for the image.

(By default, two images are generated at a time at 1152×896, using the cinematic-default style in speed-priority mode)
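As a quick sanity check on that default size, the arithmetic below (our own illustration, not part of Fooocus) shows that 1152×896 is a 9:7 image with roughly 98% of the pixels of a 1024×1024 image, the square resolution SDXL is commonly run at. The function name `describe_resolution` and the 1024 reference side are assumptions made for this example.

```python
from math import gcd

def describe_resolution(width: int, height: int, reference_side: int = 1024):
    """Return (reduced aspect ratio, pixel count, share of a square reference image)."""
    divisor = gcd(width, height)
    aspect = (width // divisor, height // divisor)
    pixels = width * height
    # Compare against a square image with the given side (1024 is an assumption).
    share = pixels / (reference_side * reference_side)
    return aspect, pixels, share

aspect, pixels, share = describe_resolution(1152, 896)
print(aspect, pixels, share)  # (9, 7) 1032192 0.984375
```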

If advanced settings are required, tick Advanced in the lower left corner, and the configuration information will appear on the right side of the page, divided into three tabs:

Things that can be adjusted include size, quantity, style, performance and more.

If you are a power user, you can also choose the model version and even adjust the LoRA parameters.

There is also an advanced option for adjusting sharpness.

For the same content, the following GIF shows sharpness going from 2 to 10 and then to 20. As sharpness increases, the image becomes more and more detailed:

We also tested whether Fooocus supports Chinese prompts. Unfortunately, it does not yet.

For example, we entered the Chinese word for "apple" as the prompt, and the result was a girl.

This... is it trying to say "You're the apple of my eye"?

Now that you know how to use Fooocus, how do you set it up?

If you have a Windows machine with an Nvidia graphics card, you can use the out-of-the-box version. (This is probably the 114514th time Jensen Huang has come out the big winner)

The hardware must also meet the minimum requirements: 4GB of video memory and 8GB of system memory.
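If you are unsure how much video memory your card has, a small script along these lines can check it before you download anything. This is a minimal sketch, not part of Fooocus: it assumes the `nvidia-smi` tool that ships with the Nvidia driver is on your PATH, the function names are our own, and the 128 MiB tolerance is an assumption to allow for totals reported slightly below the nominal size. It does not check the 8GB system memory requirement.

```python
import shutil
import subprocess
from typing import List

MIN_VRAM_MIB = 4 * 1024   # 4GB of video memory, the minimum quoted in the article
TOLERANCE_MIB = 128       # assumption: allow totals reported slightly under nominal

def parse_vram_mib(nvidia_smi_output: str) -> List[int]:
    """Parse one total-memory value (in MiB) per GPU line, skipping non-numeric lines."""
    values = []
    for line in nvidia_smi_output.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values

def query_vram_mib() -> List[int]:
    """Ask nvidia-smi for each GPU's total memory; return [] if the tool is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return []
    return parse_vram_mib(result.stdout) if result.returncode == 0 else []

def meets_vram_minimum(vram_per_gpu: List[int]) -> bool:
    """True if at least one GPU reaches the minimum (within the tolerance)."""
    return any(v >= MIN_VRAM_MIB - TOLERANCE_MIB for v in vram_per_gpu)

if __name__ == "__main__":
    gpus = query_vram_mib()
    print("GPU memory (MiB):", gpus, "-> meets 4GB minimum:", meets_vram_minimum(gpus))
```

On a machine without an Nvidia driver the script simply reports an empty list and `False`.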

Download directly from here first:

After extracting the archive, double-click run.bat. The program automatically downloads the model and sets itself up, and it is ready to use once configuration finishes.

The configuration requirements of the Linux version are the same as those of Windows, but the configuration process is more complicated.

(If you have Jupyter, you can also refer to the notebook file used in Colab)

First, install the environment dependencies:

git clone [Fooocus repository URL]
cd Fooocus
conda env create -f environment.yaml
conda activate fooocus
pip install -r requirements_versions.txt

Then download the model file and store it in the specified directory:

For details, please refer to the GitHub page

Of course, you can also let the system automatically download the model:

python launch.py

If you are using a Mac, or the hardware configuration does not meet the requirements, you can also run it directly with Colab.

(Portal:

One complaint, though: the Colab version crashes from time to time, either stopping on its own or running out of memory...

If you want Fooocus to run more smoothly on a Mac or on a machine with an AMD graphics card, you will have to wait for an update from the author.

Overall, Fooocus produces good images. With well-chosen prompts, it can even stand in for Stable Diffusion. The key point is that its hardware requirements are low.

How on earth is this possible?

The latest project from the author of ControlNet

In terms of architecture design, Fooocus is mainly divided into two parts: the interactive interface and the AI model.

The interactive interface draws on two projects, stable-diffusion-webui and ComfyUI.

stable-diffusion-webui mainly covers the front-end design of the interactive interface:

ComfyUI provides both a GUI and a back end for Stable Diffusion:

As for the AI model, it can be seen that the new SDXL model of Stable Diffusion is used:

This is currently one of the best versions of Stable Diffusion, and its output quality is much improved over the previous version 1.5.

Although Fooocus's model and UI design draw on existing open-source Stable Diffusion projects, the author added many optimizations of his own, which make the model run more smoothly.

For example, Fooocus uses the author's own advanced k-diffusion-based sampling method, which improves sampling continuity, reduces performance overhead, and improves sampling efficiency.

The author also carefully tuned the sampler parameters and, building on the original version, modified and added settings such as the cinematic style.

Fooocus includes a LoRA option because the author found that SDXL with a LoRA applied (at a weight below 0.5) is almost always better than SDXL without one.
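For readers unfamiliar with what a LoRA weight controls: in the standard LoRA formulation, a low-rank update (the product of two small matrices) is scaled by the weight and added to a base weight matrix. The toy sketch below illustrates that idea with plain Python lists. It is not Fooocus's code, and the names `apply_lora`, `lora_up`, and `lora_down` are our own.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def apply_lora(base, lora_up, lora_down, weight):
    """Return base + weight * (lora_up @ lora_down), the usual LoRA merge."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + weight * d for w, d in zip(base_row, delta_row)]
        for base_row, delta_row in zip(base, delta)
    ]

base = [[1.0, 0.0], [0.0, 1.0]]   # toy 2x2 base weights
lora_up = [[1.0], [2.0]]          # 2x1 factor
lora_down = [[0.5, -0.5]]         # 1x2 factor, so the update has rank 1

print(apply_lora(base, lora_up, lora_down, 0.5))  # [[1.25, -0.25], [0.5, 0.5]]
```

A weight of 0 leaves the base model unchanged, and smaller weights such as the sub-0.5 values mentioned above nudge the base weights only gently.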

The author who developed the Fooocus project is named Lvmin Zhang. He graduated from Soochow University in 2021 and is currently a PhD student at Stanford University.

Almost all of his projects, including ControlNet and style2paints, have been hits:

Now, the latest project, Fooocus, looks to be equally popular.

On social media, users have already compiled an Excel collection of Fooocus prompts for different styles.

If you don't know what kind of picture to generate, just refer to the prompt words in this document:

Have you figured out what kind of images you want to generate with Fooocus?

project address:

