• 29 Posts
  • 21 Comments
Joined 1 year ago
Cake day: June 26th, 2023


  • To a degree, my question is: how do you feel about others being able to generate content, especially when it is limited in flexibility and quality?

    Also, I’m curious whether you see the real potential market if you flipped the perspective, adopted the tech, and used it to your advantage. Maybe it is layering and backgrounds for composition, maybe it is full-on training to generate content, or maybe it is simply maximizing your time by letting the AI rework images.

    The typical image generation process most people think of starts from an image of mathematically random noise and turns it into a rendering of the text prompt over a series of steps. There are other methods too. One takes an image as input, overlays some noise, and uses the result as the baseline to generate from. Basically, a small amount of noise can be added to a blurry or bad image and the AI can render it better. This isn’t like photo filters or editing. I would be using this to my advantage. I would also look very carefully at what is hard to generate with AI right now and focus on making stuff that it cannot do well. There is a lot more generated content out there than I thought before I learned how this works and what AI does poorly.
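
The image-as-input method described above can be sketched roughly like this. This is a toy illustration, not Stable Diffusion's actual scheduler: the function names, the `strength` parameter, and the square-root blending formula are all assumptions made for the example. The point is only that a low strength keeps most of the original image and runs only a few of the denoising steps.

```python
import random


def add_noise(pixels, strength, seed=0):
    """Blend pixel values with Gaussian noise (hypothetical helper).

    strength=0.0 returns the input unchanged; strength=1.0 is pure noise.
    The square-root weights are an assumed, simplified blending rule.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    rng = random.Random(seed)
    keep = (1.0 - strength) ** 0.5  # how much of the original survives
    mix = strength ** 0.5           # how much noise is layered on top
    return [keep * p + mix * rng.gauss(0.0, 1.0) for p in pixels]


def steps_to_run(total_steps, strength):
    """Number of denoising steps left to run for a given strength."""
    return min(int(total_steps * strength), total_steps)


if __name__ == "__main__":
    blurry_image = [0.2, 0.4, 0.6, 0.8]  # stand-in for real pixel data
    print(add_noise(blurry_image, strength=0.25))
    print(steps_to_run(total_steps=20, strength=0.5))
```

With a strength near 0 the output stays close to the input image, which is why this approach works for cleaning up a bad image rather than inventing a new one.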


  • Honestly, may I ask, how do you perceive this?

    I have used images to help me learn how training works with AI. It is far easier to see ass nipples are a mistake than it is to see that poor text training has resulted in a middle aged woman with excessive hairiness and a passion for gardening is now going by the name Harry Potter.

    I may have a database of images and trained models that I have used to learn, not your content in particular, and not any particularly good results. I’ve mostly explored why labias are so bad with stable diffusion, and scrapped a couple of ftv galleries. I wouldn’t call myself a fan of anyone really. I’m certainly not a mark in this space. My real interest is in other AI applications. Posting trained models of people seems too gray area for me. At the same time, this is becoming a super powerful tool that essentially expands exposure and likely attracts the type of person that would pay for more. Like the recent creation of Open Dream makes it possible to do image layering for complex composition. I’m curious about a content creator’s take here.


  • It isn’t too hard to read how the scripts parse prompts. I haven’t gone into much detail when it comes to Stable Diffusion. The GUIs written in Gradio, like Oobabooga for text or Automatic1111, are quite simple Python scripts. If you know the basics of code like variables, functions, and branching, you can likely figure out how the text is parsed. This is the most technically correct way to figure this stuff out. Users tend to share a lot of bad information, especially in the visual arts space, and even more so if they use Windows.

    The prompt parsing method is part of the script, so if we don’t know what software you are using, it is hard to tell you what to do with certainty. I think most are compatible, but I don’t know for sure. In the LLM text space, things like characters are parsed differently across various systems.
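
As an example of what that parsing looks like, here is a minimal sketch of the emphasis syntax Automatic1111 documents: `(text)` multiplies attention by 1.1, `[text]` divides it by 1.1, and `(text:1.5)` sets an explicit weight. This is not the script's own parser. The function name `token_weight` is made up for illustration, and it only handles a single comma-separated token that is wrapped as a whole.

```python
import re

# Matches an explicit weight such as "(greyscale:1.2)".
EXPLICIT = re.compile(r"^\((.+):([0-9]*\.?[0-9]+)\)$")


def token_weight(token):
    """Return (text, weight) for one prompt token (hypothetical helper)."""
    token = token.strip()
    match = EXPLICIT.match(token)
    if match:
        return match.group(1), float(match.group(2))
    weight = 1.0
    # Each surrounding pair of parentheses boosts attention by 1.1.
    while len(token) >= 2 and token[0] == "(" and token[-1] == ")":
        token = token[1:-1]
        weight *= 1.1
    # Each surrounding pair of square brackets reduces it by 1.1.
    while len(token) >= 2 and token[0] == "[" and token[-1] == "]":
        token = token[1:-1]
        weight /= 1.1
    return token, weight


if __name__ == "__main__":
    for raw in ["blurry", "((monochrome))", "(greyscale:1.2)", "[cartoon]"]:
        print(token_weight(raw))
```

Other front ends may weight tokens differently, which is why reading the parser of the tool you actually run beats guessing.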

    With Automatic1111, on the txt2img page, there is a small red icon under the image that opens a menu in the GUI listing all the LoRAs you have placed in the LoRA folder on the host system where you installed A1111. Most of the LoRAs you download that show up on the txt2img page will have a small circled “i” icon in one corner. Clicking it usually shows a list of the text data that was used to train the LoRA. This text data was associated with each training image, and these are the keywords that will trigger certain LoRA attributes. When you have the LoRA menu open, clicking any entry automatically adds the tag that sets the strength of the LoRA’s influence on the prompt. This defaults to 1.0, but that is nearly always too high. Most of the time 0.2-0.7 works okay. You also need the main keyword that triggers the LoRA somewhere in the prompt. This can be difficult to find unless you keep the information from the place you downloaded the LoRA from. Personally, I rename all of my LoRAs to whatever the keyword is. Also, you’re likely going to collect a lot of LoRAs eventually. Get in the habit of putting an image representative of what each LoRA does in the LoRA folder. The image should have the same name as the LoRA itself. A1111 will automatically show this image on each entry you see in the GUI menu. LoRAs are not hard to train either. Try it some time. If you can generate images, you can train LoRAs.
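
The tag that clicking a menu entry adds looks like `<lora:filename:weight>`. As a rough sketch of how such tags can be pulled out of a prompt (this is not A1111's own code, and `extract_loras` and the example LoRA name `mykeyword` are hypothetical):

```python
import re

# Matches tags of the form <lora:name:weight>, e.g. <lora:mykeyword:0.6>.
LORA_TAG = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")


def extract_loras(prompt):
    """Split a prompt into its plain text and a list of (name, weight)."""
    loras = [(name, float(weight)) for name, weight in LORA_TAG.findall(prompt)]
    without_tags = LORA_TAG.sub("", prompt)
    # Tidy up the commas and spaces left behind by the removed tags.
    parts = [part.strip() for part in without_tags.split(",")]
    cleaned = ", ".join(part for part in parts if part)
    return cleaned, loras


if __name__ == "__main__":
    prompt = "portrait photo, mykeyword, <lora:mykeyword:0.6>"
    print(extract_loras(prompt))
```

Note that the example prompt carries both the tag and the trigger keyword: the tag only loads the LoRA at the given strength, while the keyword is what activates what it was trained on.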


  • It is easy to have too many cooks in the kitchen, but that is an easy problem to solve. Model decay is not a real problem if you understand how an LLM works. Overtraining is like burning a big dinner and ruining a meal. One doesn’t stop cooking forever, or burn down the house and quit. You just cook another meal next time. If your model has 100 trillion tokens, you’re likely to try your very best to salvage your massive ruined dish, but in the end, it doesn’t matter. You can easily tweak the recipe for next time. Models have no persistent memory. Context can be turned into data and used for training, but it is a totally separate thing that is unrelated to the model itself. As an oversimplification, an LLM is just a large database of categories mixed with a massive amount of language data that enables a statistical calculation of what word should come next. Everything else is censoring algorithms and illusions embedded in how humans use language. Really, this is a tool to access culture through language, and in the case of larger models, the culture embedded in many different human languages.
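
To make the “what word comes next” idea concrete, here is a toy sketch that counts which word follows which in a bit of text and predicts the most common follower. Real LLMs use a neural network over tokens rather than a lookup table like this, so treat it only as an illustration of the oversimplification above. The function names are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    table = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table


def predict_next(table, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = table.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    table = train_bigrams("the cat sat on the mat the cat ran")
    print(predict_next(table, "the"))
```

Nothing in the table changes when you ask it for a prediction, which mirrors the point that the model itself has no persistent memory. Retraining just means building a new table.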

    This is as much of a “fad” now as the internet was in the late ’90s, and this is on par with that change. LLMs are no fad. This is a tool as disruptive as the public internet. For instance, in 10 years, Google will be a relic of the past. AI will completely replace it. Education will also completely change. It is possible to have entirely individualized education. Psychology will change, as an LLM can be tuned to address and help with many human social issues. This will change everything because it already exists in the open source space.










  • { “seed”: 3380144426, “prompt”: “Sexy beautiful woman nude, bright blue eyes, blonde hair, light blue, striking features, intricate details, dramatic composition, large breasts, skinny, small, wispy blonde hair, makeup, contrast, texture, realism, high-quality rendering, stunning art, high quality, film grain, Fujifilm XT3, acne, detailed skin, freckled, (((July 4th body paint))), nude naked, at a large block party, on the street, at night, in the middle of the crowd, people in the background, fireworks, lots of fireworks, big party”, “negative_prompt”: “child, kid, NG_DeepNegetive_V1_75T, (((multiple heads))), (greyscale:1.2), disabled body, (worst quality:2), (low quality:2), (normal quality:2), lowres, normal quality, ((monochrome)), ((grayscale)), skin spots, acnes, skin blemishes, age spot, glans, open mouth, cross-eyed, blurry, head out of frame, 3D render, cartoon, anime, rendered, frizzy hair, permed hair, fake, drawing, extra fingers, mutated hands, mutation, mutilated, deformed, extra limbs, 3D, 3DCG, cgstation, dress, blouse, skirt, panties, pants, shorts”, “cfg_scale”: 9, “width”: 512, “height”: 768 }






  • Is anyone here using SD with Blender, or for stuff other than NSFW? I’m looking to see whether a laptop with an embedded RTX 4050 or 4060 is viable. What is the real limit here? Can anyone tell me something like “8 GB of VRAM just can’t do (XYZ)”, or that the iteration time becomes so long the workflow is not practical or the software crashes? Anyone running Linux on a 4050/4060? What kind of impact do storage speeds have, and how much DDRx system memory is used in practice?


  • These are alternates for this post and notes for a few others that were not worth posting on their own. These notes and posts may be silly if your perspective is someone hosting your own software where iteration takes a few seconds. For someone online, on a rate limited account, the perspective is very different.

    This is what “tan lines” really does on this instance. It just doesn’t work and becomes clothing like tan bikini bottoms.

    This is (very loosely) “leaning back with arms behind back resting on elbows” taken WAY out of context.

    Crazy, broken looking results from trying to force a better labia than this instance is capable of producing.

    (Loosely)“Fishnet stockings and heels only, with a pink pussy.”