Training GPT to Write Midjourney Prompts
📣 Workflow: copy each step below, in order, and carry on the conversation with GPT.
How it works: feed Midjourney's official documentation to GPT so that it learns the tool's mechanics and prompt structure step by step, then have it produce suitable prompts.
Tip: if Midjourney's official documentation gets updated, feel free to substitute the newer text yourself.
1——————————————————————————————————————
I am going to use a diffusion model to generate an image or photo. I will now provide you with material about this model. Is that OK?
2——————————————————————————————————————
这是Midjourney的工作原理介绍:
Midjourney is an AI image generation tool that takes inputs through text prompts and parameters and uses a Machine Learning (ML) algorithm trained on a large amount of image data to produce unique images. It is powered by the Latent Diffusion Model (LDM), a cutting-edge text-to-image synthesis technique. Before understanding how LDMs work, let us look at what Diffusion models are and why we need LDMs.
Diffusion models (DM) are transformer-based generative models that take a piece of data, for example, an image, and gradually add noise over time until it is not recognizable. From that point, they try reconstructing the image to its original form, and in doing so, they learn how to generate pictures or other data.
The issue with DMs is that the powerful ones often consume hundreds of GPU days, and inference is quite expensive due to sequential evaluations. To enable DM training on limited computational resources without compromising their quality as well as flexibility, DMs are applied in the latent space of powerful pre-trained autoencoders.
Training a diffusion model on such a representation makes it possible to achieve an optimal point between complexity reduction and detail preservation, significantly improving visual fidelity. Introducing a cross attention layer to the model architecture turns the diffusion model into a powerful and flexible generator for generally conditioned inputs such as text and bounding boxes, enabling high-resolution convolution-based synthesis.
No need for a detailed reply yet; please just confirm that you have received this.
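The forward noising process described in the excerpt above can be sketched in a few lines. This is a toy illustration on a list of numbers, not Midjourney's actual implementation; the function name and the linear beta schedule are invented for the example:

```python
import math
import random

def forward_diffusion(x0, t, betas):
    """Noise a toy 'image' x0 (a list of floats) for t steps.

    Uses the closed form of the DM forward process:
        x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
    where alpha_bar_t is the product of (1 - beta_s) for the first t steps.
    """
    alpha_bar = 1.0
    for beta in betas[:t]:
        alpha_bar *= (1.0 - beta)
    signal = math.sqrt(alpha_bar)          # how much of the original survives
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal * v + noise_scale * random.gauss(0.0, 1.0) for v in x0]

# Hypothetical linear noise schedule over 1000 steps.
betas = [1e-4 + (0.02 - 1e-4) * i / 999 for i in range(1000)]
x0 = [1.0] * 8          # a toy "image"
x_t = forward_diffusion(x0, 1000, betas)  # after 1000 steps, mostly noise
```

At t = 0 nothing has been noised yet and the input comes back unchanged; by the final step the signal coefficient is tiny and the output is dominated by Gaussian noise, which is exactly the state the model then learns to reverse.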
3——————————————————————————————————————
Version
Midjourney routinely releases new model versions to improve efficiency, coherency, and quality. The latest model is the default, but other models can be used using the --version or --v parameter or by using the /settings command and selecting a model version. Different models excel at different types of images.
Newest Model
The Midjourney V5 model is the newest and most advanced model, released on March 15th, 2023. To use this model, add the --v 5 parameter to the end of a prompt, or use the /settings command and select MJ Version 5. This model has very high Coherency, excels at interpreting natural language prompts, is higher resolution, and supports advanced features like repeating patterns with --tile.
What's new with the V5 base model?
-Much wider stylistic range and more responsive to prompting
-Much higher image quality (2x resolution increase) improved dynamic range
-More detailed images. Details more likely to be correct. Less unwanted text
-Improved performance with image prompting
-Supports --tile argument for seamless tiling (experimental)
-Supports --ar aspect ratios greater than 2:1 (experimental)
-Supports --iw for weighing image prompts versus text prompts
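The parameters listed above all go at the end of a single prompt. The following is a hypothetical example (the subject text and reference URL are invented) showing only the syntax:

```
/imagine prompt: https://example.com/ref.jpg watercolor painting of a lighthouse at dusk --v 5 --ar 3:1 --iw 0.5
```

Here --v 5 selects the V5 model, --ar 3:1 requests an aspect ratio wider than 2:1, and --iw 0.5 weights the reference image against the text portion of the prompt.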
Style and prompting for V5
-Today's test is basically a 'pro' mode of the model.
-It's MUCH more 'unopinionated' than v3 and v4, and is tuned to provide a wide diversity of outputs and to be very responsive to your inputs.
-The tradeoff here is that it may be harder to use. Short prompts may not work as well. You should try to write longer, more explicit text about what you want (ie: "cinematic photo with dramatic lighting")
-Please chat with each other in prompt-chat to figure out how to use v5
-We hope to have a 'friendly' default styling for v5 before we switch it to default. When this happens we will still let you turn it off and get back to something like this 'raw' mode today.
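Following the advice above about longer, more explicit prompts, a hypothetical before/after pair (both prompts invented for illustration):

```
/imagine prompt: a dog --v 5
/imagine prompt: cinematic photo of a golden retriever running along a beach at sunset, dramatic lighting, shallow depth of field --v 5
```

The second prompt spells out the style, subject, setting, and lighting that V5's 'unopinionated' tuning no longer fills in on its own.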
Please note
-This is an alpha test and things will change. DO NOT rely on this exact model being available in the future. It will be significantly modified as we take V5 to full release.
-Right now there is no V5 upsampler, the default resolution of V5 is the same as upscaled V4. If you click upscale it will just instantly give you that one image by itself.
Community Standards:
-This model can generate much more realistic imagery than anything we've released before.
-We've increased the number of moderators, improved moderation tooling, and will be enforcing our community standards with increased strictness and rigor. Don't be a jerk or create images to cause drama.
More about V5:
V5 is our second model trained on our AI supercluster and has been in the works for 5 months. It uses significantly different neural architectures and new aesthetic techniques. V5 isn't the final step, but we hope you all feel the progression of something deep and unfathomable in the power of our collective human imagination.