This was a fun little experiment. I am quite surprised how well this model came out, considering the base checkpoint has zero information about this particular point of view. That means the whole thing is being interpolated just from the data I provided, which really speaks to SDXL's flexibility. Getting perfect results would require training a whole checkpoint on thousands of these types of images, but you can now have the next best thing.
Here are the settings I found worked best: LoRA at 1.0 strength and 1024x1024 resolution. I did train with bucketing, though, and many aspect ratios besides 1:1 yield interesting results. A lower CFG of about 4, and no more than 5, works best.
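For anyone scripting generation instead of using a UI, the settings above could be written down roughly like this. It is only a sketch, assuming the Hugging Face diffusers library: the names GenerationSettings, clamp_cfg, pipeline_kwargs and generate are made up for illustration, the base model ID is a guess (the post does not say which SDXL checkpoint was used), and lora_path is a placeholder for wherever you saved the LoRA file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Recommended settings from the post (hypothetical container)."""
    lora_scale: float = 1.0      # LoRA at 1.0 strength
    width: int = 1024            # 1024x1024; other aspect ratios also work
    height: int = 1024
    guidance_scale: float = 4.0  # CFG of about 4


def clamp_cfg(cfg: float, upper: float = 5.0) -> float:
    """Cap CFG at the 'no more than 5' limit mentioned above."""
    return min(cfg, upper)


def pipeline_kwargs(prompt: str, settings: GenerationSettings = GenerationSettings()) -> dict:
    """Build keyword arguments for an SDXL pipeline call."""
    return {
        "prompt": prompt,
        "width": settings.width,
        "height": settings.height,
        "guidance_scale": clamp_cfg(settings.guidance_scale),
        # In diffusers, the LoRA strength can be passed as a cross-attention scale.
        "cross_attention_kwargs": {"scale": settings.lora_scale},
    }


def generate(prompt: str, lora_path: str,
             base_model: str = "stabilityai/stable-diffusion-xl-base-1.0"):
    """Assumed workflow: load an SDXL base model, attach the LoRA, render one image."""
    # Imported lazily so the helpers above work without torch or diffusers installed.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        base_model, torch_dtype=torch.float16
    ).to("cuda")
    pipe.load_lora_weights(lora_path)  # placeholder path to the LoRA file
    return pipe(**pipeline_kwargs(prompt)).images[0]
```

The negative prompt is left out on purpose, since the post points to the example image metadata for that.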
Here are some examples of the prompt schema that works best. Also check the metadata of the example images for things like the negative prompt.
naked girl laying on the street, busy paris street in background
girl laying on the street, wearing a bikini, busy paris street in background
topless girl laying on the street, busy paris street in background
wonder woman laying on the back patio
And so on. Usually you want to say something like "girl" or "woman" laying in x environment, with something happening in the background. Tags like "topless" and "fucked by a muscular man" also work.