Showing posts with label AI. Show all posts

Wednesday, January 29, 2025

Does DeepSeek create the image from text specification?

 While conversing with DeepSeek, I learned that it does not support image creation. At present, it handles only text.

However, it offered to create an image from text using Python and PIL. As I have been using these libraries, I gave it a try.

Here is the text specification that I used to create an image from Gemini AI [https://hodentekhelp.blogspot.com/2025/01/how-do-you-create-image-using-gemini-ai.html].

"Create an icon with the letter H decorated in yellow, with its insides filled with a web of electronic circuits. The background should be black  23-bit color with 8-bit transparency and the format in png."

Gemini AI created the following image:


Here is the suggestion from DeepSeek:

```python
from PIL import Image, ImageDraw, ImageFont

# Create a blank image with a black background and transparency
width, height = 256, 256  # Icon size
background_color = (0, 0, 0, 0)  # Black with full transparency
image = Image.new("RGBA", (width, height), background_color)

# Initialize ImageDraw
draw = ImageDraw.Draw(image)

# Draw the letter "H"
font = ImageFont.truetype("arial.ttf", 150)  # Use a system font like Arial
letter_color = (255, 255, 0, 255)  # Yellow
draw.text((75, 50), "H", font=font, fill=letter_color)

# Draw a web of electronic circuits inside the "H"
# This is a simplified representation using lines
circuit_color = (0, 255, 0, 255)  # Green for circuits
draw.line((90, 100, 90, 200), fill=circuit_color, width=2)  # Vertical line
draw.line((90, 150, 160, 150), fill=circuit_color, width=2)  # Horizontal line
draw.line((160, 100, 160, 200), fill=circuit_color, width=2)  # Vertical line
draw.line((100, 120, 150, 120), fill=circuit_color, width=2)  # Horizontal line
draw.line((100, 180, 150, 180), fill=circuit_color, width=2)  # Horizontal line

# Save the image as a PNG file
image.save("icon_H_circuit.png", "PNG")

print("Icon created and saved as 'icon_H_circuit.png'")
```
I ran this code in PyCharm, adding image.show() to display the result:

from PIL import Image, ImageDraw, ImageFont

# Create a blank image with a black background and transparency
width, height = 256, 256 # Icon size
background_color = (0, 0, 0, 0) # Black with full transparency
image = Image.new("RGBA", (width, height), background_color)

# Initialize ImageDraw
draw = ImageDraw.Draw(image)

# Draw the letter "H"
font = ImageFont.truetype("arial.ttf", 150) # Use a system font like Arial
letter_color = (255, 255, 0, 255) # Yellow
draw.text((75, 50), "H", font=font, fill=letter_color)

# Draw a web of electronic circuits inside the "H"
# This is a simplified representation using lines
circuit_color = (0, 255, 0, 255) # Green for circuits
draw.line((90, 100, 90, 200), fill=circuit_color, width=2) # Vertical line
draw.line((90, 150, 160, 150), fill=circuit_color, width=2) # Horizontal line
draw.line((160, 100, 160, 200), fill=circuit_color, width=2) # Vertical line
draw.line((100, 120, 150, 120), fill=circuit_color, width=2) # Horizontal line
draw.line((100, 180, 150, 180), fill=circuit_color, width=2) # Horizontal line

# Save the image as a PNG file
image.save("icon_H_circuit.png", "PNG")
image.show()

print("Icon created and saved as 'icon_H_circuit.png'")

Here is the image created by DeepSeek:

Well, given the limitations, it did create an H with a minimalist circuit (it chose the colors and the artwork itself).
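Two details in the suggested code are worth a second look: an alpha value of 0 makes the background fully transparent rather than black, and the circuit lines are drawn over the letter instead of being confined to it. The following is only a sketch of one possible refinement, not what DeepSeek produced. The function names load_font and make_icon, the grid spacing, and the fallback to Pillow's built-in font when arial.ttf is missing are all my own assumptions.

```python
from PIL import Image, ImageDraw, ImageFont


def load_font(size):
    """Try Arial first; fall back to Pillow's built-in font if it is missing."""
    try:
        return ImageFont.truetype("arial.ttf", size)
    except OSError:
        return ImageFont.load_default()


def make_icon(size=256):
    """Build an icon whose grid pattern is visible only inside the letter H."""
    font = load_font(150)

    # Opaque black background: alpha 255 (alpha 0 would be fully transparent)
    icon = Image.new("RGBA", (size, size), (0, 0, 0, 255))

    # Grayscale mask: 255 where the letter is, 0 everywhere else
    mask = Image.new("L", (size, size), 0)
    ImageDraw.Draw(mask).text((75, 50), "H", font=font, fill=255)

    # Yellow layer covered with a simple green grid standing in for circuits
    layer = Image.new("RGBA", (size, size), (255, 255, 0, 255))
    draw = ImageDraw.Draw(layer)
    for pos in range(0, size, 12):
        draw.line((pos, 0, pos, size), fill=(0, 255, 0, 255), width=1)
        draw.line((0, pos, size, pos), fill=(0, 255, 0, 255), width=1)

    # Paste the patterned layer through the mask, so it shows only inside the H
    icon.paste(layer, (0, 0), mask)
    return icon


icon = make_icon()
print(icon.size, icon.mode)
```

Calling icon.save("icon_H_masked.png") (a hypothetical file name) would write the result as a PNG. If the fallback font is used, the letter will be much smaller than with Arial.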

I really think high-end chips are needed to create more exotic artwork or images. That said, what is the utility of imaginary sceneries with a multitude of colors?

The basic thing the AI should deliver is VALUE in the first place and in subsequent places.



Monday, January 27, 2025

DeepSeek rattles the US AI dominance. Can it continue to rattle?

 The emergence of DeepSeek AI, a powerful large language model from China, has sent shockwaves through the US AI industry. Developed with a focus on cost-effectiveness, DeepSeek reportedly functions effectively on lower-end hardware, a stark contrast to US models that rely heavily on high-end chips like Nvidia's. This revelation triggered a significant sell-off in Nvidia stock, highlighting the potential disruption to the current AI landscape.

https://site.financialmodelingprep.com/market-news/nasdaq-futures-plunge-amidst-concerns-over-deepseeks-impact-on-ai-chip-demand

Last night, I downloaded DeepSeek to take a peek and, lo and behold, at first sight it looked as good as Copilot, Gemini AI, and the others I have come across.

Well, what does it lack?

 A notable limitation soon became apparent: DeepSeek lacks robust image generation capabilities. While it can provide code snippets (like Python with Kivy) to generate images, this approach is less user-friendly and may be hindered by the limitations of lower-end hardware in processing and rendering graphics. In contrast, US models excel not only at creating images but also at seamlessly modifying them based on simple text prompts. This seems to be beyond the reach of low-end chips.

This development necessitates a renewed focus on innovation and optimization within the US AI sector. US developers must prioritize improving model efficiency and exploring alternative hardware solutions to maintain a competitive edge. While DeepSeek presents a significant challenge, it also serves as a valuable catalyst for further advancements in AI technology.

Sunday, December 15, 2024

Have you considered image augmentation in image data generation - Part 1?

 Python has emerged as a programming language of choice when it comes to generating images through image augmentation techniques. Image augmentation techniques can provide a rich diversity of the training dataset and turbocharge the image recognition models. 

This post explores one of the fundamental image data augmentation techniques, random cropping. A future post will discuss random clipping.

Cropping and flipping are common image manipulation techniques that don't require explicit redefinition. They are often implemented using built-in functions or libraries like OpenCV and PIL.

Details of random cropping:

Here is the Python code to randomly crop an image.

=========

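import numpy as np

def random_crop(img, crop_size):
    # img is a NumPy array; crop_size is (crop height, crop width)
    h, w = img.shape[:2]
    # Random top-left corner chosen so that the crop stays inside the image
    x = np.random.randint(0, w - crop_size[1])
    y = np.random.randint(0, h - crop_size[0])
    # Slice rows (y) first, then columns (x)
    crop_img = img[y:y+crop_size[0], x:x+crop_size[1]]
    return crop_img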

============================

The random cropping shown above does not use OpenCV directly; it relies on NumPy array slicing.

Note: h, w = img.shape[:2] picks up the image's height and width, in that order, from the NumPy array. The axes of an image array are ordered (height, width, color channel).

Also note:

NumPy arrays use zero-based indexing. This means the first element in an array has an index of 0, the second has an index of 1, and so on.

In the random_crop function:

x = np.random.randint(0, w - crop_size[1])

This line generates a random starting x-coordinate for the crop. w is the width of the image.

crop_size[1] is the width of the cropped region.

By subtracting crop_size[1] from w, we ensure that the starting x-coordinate plus the crop width doesn't exceed the image's width.

y = np.random.randint(0, h - crop_size[0])

This line generates a random starting y-coordinate for the crop.

h is the height of the image. crop_size[0] is the height of the cropped region.

Similarly, this ensures the starting y-coordinate plus the crop height doesn't exceed the image's height.

Random cropping is the same as regular cropping, except that the cropping coordinates are chosen at random.
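One detail of the bounds is worth noting: the upper limit of np.random.randint is exclusive, so a crop that is exactly as wide or as tall as the image raises an error, and the last valid starting position is never chosen. The sketch below is one way to handle this, not the only one. The name random_crop_safe and the optional rng parameter are my own additions for illustration.

```python
import numpy as np


def random_crop_safe(img, crop_size, rng=None):
    """Return a random crop of shape (crop_h, crop_w) from a NumPy image array."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape[:2]
    crop_h, crop_w = crop_size
    if crop_h > h or crop_w > w:
        raise ValueError("crop_size must not exceed the image size")
    # integers() excludes the upper bound, so add 1 to allow the last valid offset
    y = int(rng.integers(0, h - crop_h + 1))
    x = int(rng.integers(0, w - crop_w + 1))
    return img[y:y + crop_h, x:x + crop_w]


# Example with a synthetic 100 x 200 RGB image (height 100, width 200)
image = np.zeros((100, 200, 3), dtype=np.uint8)
patch = random_crop_safe(image, (64, 64), rng=np.random.default_rng(0))
print(patch.shape)  # (64, 64, 3)

# A crop the same size as the image is allowed and returns the whole image
full = random_crop_safe(image, (100, 200))
print(full.shape)  # (100, 200, 3)
```

Passing a seeded generator, as in the first call, makes the crop repeatable, which helps when debugging an augmentation pipeline.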

The following code is just regular cropping of an image:

====================

from PIL import Image

def crop_image(image_path, left, top, right, bottom):
    """
    Crops an image using PIL.

    Args:
        image_path: Path to the image file.
        left: X-coordinate of the top-left corner of the cropping region.
        top: Y-coordinate of the top-left corner of the cropping region.
        right: X-coordinate of the bottom-right corner of the cropping region.
        bottom: Y-coordinate of the bottom-right corner of the cropping region.

    Returns:
        The cropped image as a PIL Image object.
    """
    try:
        img = Image.open(image_path)
        cropped_img = img.crop((left, top, right, bottom))
        return cropped_img
    except Exception as e:
        print(f"Error cropping image: {e}")
        return None


# Example usage:
image_path = r"C:\Users\hoden\PycharmProjects\exploreImage\Images\TheKiss.jpg"
left = 100
top = 50
right = 300
bottom = 200

cropped_image = crop_image(image_path, left, top, right, bottom)

if cropped_image:
    cropped_image.show()
    cropped_image.save("cropped_image.jpg")

====================

Now, let us consider random cropping:

To be continued.

What are some useful built-in functions of "Random" in Python?

 Random numbers are needed in most studies in information technology such as:

  • Simulation
  • Game Development
  • Machine Learning, AI, Image recognition
  • Cryptography
  • Code testing and many more.

The random module in Python provides all of these functions. You need to import random to work with them. Using this module you can generate random numbers and sequences. However, the numbers generated are pseudo-random: they look random but are produced deterministically. To generate truly random numbers, one has to resort to physical sources such as electronic noise, radioactive decay, or weather-related phenomena.
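The deterministic nature of pseudo-random numbers is easy to see by seeding the generator. The short sketch below is my own illustration: the seed value 2024 is arbitrary, and the use of the standard-library secrets module for security-sensitive values is a suggestion on my part, since the random module is not intended for cryptography.

```python
import random
import secrets

# Seeding with the same value reproduces the same "random" sequence
random.seed(2024)
first_run = [random.randint(1, 100) for _ in range(5)]

random.seed(2024)
second_run = [random.randint(1, 100) for _ in range(5)]

print(first_run == second_run)  # True

# For passwords, tokens, and similar uses, the secrets module draws from
# the operating system's source of randomness instead
token = secrets.token_hex(16)  # 16 random bytes shown as 32 hex characters
print(len(token))  # 32
```

Reproducibility is useful in simulation and code testing, where the same run needs to be repeated exactly.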

Here is a brief description of the functions:

1. random.random():

Generates a random floating-point number between 0.0 (inclusive) and 1.0 (exclusive).

2. random.uniform(a, b):

Generates a random floating-point number between a and b. The value a is included; the end point b may or may not be included, depending on floating-point rounding.

3. random.randint(a, b):

Generates a random integer between a and b, inclusive of both a and b.

4. random.randrange(start, stop[, step]):

Generates a random integer from the range [start, stop), with an optional step value.

5. random.choice(seq):

Returns a random element from the sequence seq.

6. random.shuffle(x):

Shuffles the elements of the sequence x in place.

7. random.sample(population, k):

Returns a k-length list of unique elements chosen from the population sequence. (Since Python 3.11, a set is no longer accepted as the population; convert it to a list or tuple first.)

A sequence is a general form of data structure that stores a collection of items in a specific order. The items can be of different data types. In simple terms, any ordered collection that can be indexed by position is a sequence.

Here are two common kinds of sequences, tuples and lists:

Tuples (Immutable Sequences):

  • Ordered collection of elements.
  • Elements cannot be changed after creation.
  • Often used to represent fixed data structures.

Lists (Mutable Sequences):

  • Ordered collection of elements.
  • Elements can be modified, added, or removed.
  • More flexible for dynamic data structures.

===============

The random.shuffle(mylist) function modifies the list mylist in place and does not return a new list. This means that you need to have an existing list that you want to shuffle. The function directly changes the order of elements in the original list.

===============
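As a small illustration of the in-place behavior, the sketch below shuffles a named list and also shows one way to get a shuffled copy while leaving the original untouched. The variable names are mine, and using random.sample with k equal to the list length is just one of several ways to obtain a copy.

```python
import random

# Shuffle a named list in place; the return value is None
my_list = [1, 100, 1000, 10000]
result = random.shuffle(my_list)
print(result)   # None
print(my_list)  # the same four elements, possibly in a different order

# To keep the original list unchanged, take a full-length sample instead
original = [1, 100, 1000, 10000]
shuffled_copy = random.sample(original, k=len(original))
print(original)       # [1, 100, 1000, 10000]
print(shuffled_copy)  # the same four elements, possibly in a different order
```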

Here is an example of using the random functions in PyCharm, in a file named Randoms.py. The blank lines between the functions are only for readability and can be removed.

import random

# Generate a random float in the range [0.0, 1.0)
random_float = random.random()
print("Random Float: ", random_float)

# Generate a random integer between 1 and 25 (inclusive)
random_integer = random.randint(1, 25)
print("Random Integer: ", random_integer)

# Generate a random integer from the range [start, stop) with a step of 3
random_number_with_step = random.randrange(5, 67, 3)
print("Random Number with step: ", random_number_with_step)

# Return a random element from a sequence (list as argument)
random_choice_seq = random.choice([1, 200, 300, 412, 516])
print("Random Choice Sequence (list): ", random_choice_seq)

# Return a random character from a sequence (string as argument)
random_choice_seq = random.choice('A thing of beauty is a joy forever')
print("Random Choice Sequence (string): ", random_choice_seq)

# random.shuffle() shuffles a list in place and returns None.
# The list literal here is not stored in a variable, so the shuffled list
# is lost and the print below shows None.
random_shuffle_of_the_list = random.shuffle([1, 100, 1000, 10000])
print("Random Shuffle list in place: ", random_shuffle_of_the_list)

# random.sample(population, k) returns a new k-length list of unique elements
# chosen from the population; unlike random.shuffle, it leaves the input unchanged.

# Original list
my_list = [1, 100, 1000, 10000]
# Get a sample of 2 elements from the list
random_Sample_Population_k = random.sample(my_list, 2)
# Print the sampled list
print("Random_Sample_Population_k: ", random_Sample_Population_k)
------------------------------------------------------
Being random, the results differ on each run, except for the shuffle line, which always prints None because random.shuffle() returns None.
Random Float:  0.7630766884865164
Random Integer:  1
Random Number with step:  26
Random Choice Sequence (list):  412
Random Choice Sequence (string):  a
Random Shuffle list in place:  None
Random_Sample_Population_k:  [1, 1000]

Sunday, October 13, 2024

Do these libraries PIL and OpenCV produce exactly the same image, or will there be changes in the rendered images?

 In earlier posts we have seen how images can be displayed using PIL and OpenCV.

Now the question is will these two methods look at the same input image and display the same way?

Here is the chosen image in PNG format, which I just created using Paint. The idea is that the colors may change, but the text will remain the same.

----------------------------------------------------

Using OpenCV

import cv2

# Load the image from file
image = cv2.imread(r'C:\Users\hoden\PycharmProjects\exploreImage\Images_2\RGB.png')

# Display the image in a window
cv2.imshow('Loaded Image', image)

# Wait for a key press and close the window
cv2.waitKey(0)
cv2.destroyAllWindows()

-------------

Using Pillow (PIL)

from PIL import Image

# Open an image file

image = Image.open(r'C:\Users\hoden\PycharmProjects\exploreImage\Images_2\RGB.png')

# Display the image

image.show()

---------------------------------------------

The images displayed by the two codes are as shown:



Displayed Image: Using OpenCV
Displayed Image: Using PIL

What I found using PyCharm and the chosen image was that there was no difference. Both displayed the image the same way, without any change in color. The only difference is that the image displayed using OpenCV is slightly cut off on the left side. This may be corrected by placing the image in the center of a predesigned canvas.

What do the AI assistants say:

However, I decided to quiz the AI tools available on the web: Copilot, Gemini, and Meta AI.

Gemini's answer:

In most cases, these two methods will display the image with minimal to no noticeable differences. However, there are some potential scenarios where there might be slight variations:

1. Color Space:

OpenCV: By default, OpenCV loads images in BGR (Blue, Green, Red) color space.

Pillow: Pillow loads images based on the image format's native color space (usually RGB for common formats).

This difference is usually negligible for human perception, but it's important to be aware of if you're doing further processing that relies on specific color channels. You can convert the color space in OpenCV using cv2.cvtColor(image, cv2.COLOR_BGR2RGB).

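Note: In my test the colors did not change. cv2.imread() does return the pixel data in BGR order, but cv2.imshow() also expects BGR, so the image is displayed with the correct colors.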

2. Alpha Channel Handling:

Images with an alpha channel (transparency information) might be handled slightly differently between the libraries. In most cases, both will display the image with transparency preserved, but there could be subtle differences in how the alpha channel information is interpreted.

3. Advanced Image Formats:

Some uncommon image formats may be supported by one library but not the other, leading to different behavior. However, common formats like JPEG, PNG, and BMP should be handled identically.

In conclusion:

For most display purposes, the two methods will produce visually identical results. However, if you need precise control over color space, alpha channel handling, or are working with uncommon formats, understanding the potential differences between OpenCV and Pillow is important.


Co-Pilot's answer:

The two methods of displaying an image using OpenCV and PIL (Pillow) will not produce exactly the same rendered image due to differences in how each library handles image loading and display. Here are the key differences:

OpenCV

Color Format: OpenCV loads images in BGR format by default. This means that the colors might appear different if you don’t convert the image to RGB before displaying it.

Display Window: OpenCV uses its own windowing system (cv2.imshow) to display images. This window is managed by OpenCV and has its own look and feel.

PIL (Pillow)

Color Format: PIL loads images in RGB format by default. This means that the colors will be displayed as expected without any need for conversion.

Display Method: PIL uses the default image viewer of your operating system to display images (image.show()). This means the image will be displayed in whatever application is set as the default for viewing images on your system.

Summary

Color Handling: OpenCV displays images in BGR format, which can cause color inversion if not converted to RGB. PIL displays images in RGB format.

Display Method: OpenCV uses its own window, while PIL relies on the system’s default image viewer



Meta AI's answer:

Do OpenCV and PIL display the same image in the same way without any color differences?

OpenCV and PIL (Python Imaging Library) handle image display differently, which can result in color differences.

*OpenCV:*


1. Default color space: BGR (Blue, Green, Red)

2. Displays images in BGR format

3. Uses cv2.imshow() function for display


*PIL:*


1. Default color space: RGB (Red, Green, Blue)

2. Displays images in RGB format

3. Uses Image.show() or Image.display() functions

Due to this difference in color spaces:

- Images displayed using OpenCV will have a different color representation compared to PIL.


It is said that AI agents greatly help in computer coding. I would say that, although they may be working from the same databank of information, they need not provide the same answer. The developer has to know more than just the code.
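The BGR versus RGB point raised in the answers is about the order of the color channels in the pixel array. It matters when an array loaded by OpenCV is handed to a library that expects RGB. The sketch below is my own illustration and uses only NumPy: it builds a small array in the BGR order that cv2.imread() returns and reverses the channel axis, which for a three-channel image has the same effect as cv2.cvtColor(image, cv2.COLOR_BGR2RGB).

```python
import numpy as np

# A 2 x 2 pure-red image laid out in B, G, R channel order
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 2] = 255  # in BGR order, red is the last channel

# Reversing the channel axis converts BGR to RGB
rgb = bgr[..., ::-1]

print(bgr[0, 0].tolist())  # [0, 0, 255]
print(rgb[0, 0].tolist())  # [255, 0, 0]
```

If the BGR array were passed unchanged to an RGB-based viewer, this red image would show up as blue.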

Jump to changing color with Python:

https://hodentekhelp.blogspot.com/2024/11/how-do-you-work-with-color-images-in.html

Sunday, September 1, 2024

How do you use AI in WordPress website/blog?

 AI has recently been making the rounds in all sorts of software. How well it is doing, or whether it makes things better, is hard to judge.


Let us say, this is how your Header looks in the Appearances->Editor of your site or blog.

Let us say we want to edit the logo, then we place our cursor on the logo as shown in the picture above.

Choosing the three stars brings up the Jetpack AI Generator, which uses the description of the site to create an appropriate logo. If the one suggested by the AI is acceptable, the existing logo is replaced by the AI-created one.

Click the Generate button and the AI starts working. The image created on the first try is shown in the same picture. The image after the second run appears as shown and gets saved.

It is possible to change the prompt by clicking "enhance the prompt". It provides a prompt which can be further tweaked. The highlighted words are mine.

"Create a logo for a website catering to condominium owners in Oahu, capturing the essence of island living and community connection in a clean and modern design that reflects a sense of belonging and pride in ownership [with subtle tropical elements] [using a bright and refreshing color scheme] [incorporating a modern and sleek font that contain the letter H]"

I asked for a more vibrant color. One more round of generation yielded this.


All these AI-generated images get uploaded to the MEDIA storage.

I am not sure whether the AI-generated logo is better than the one I have. You be the judge. The AI surely generated an image based on the description, which is all it can do, but not what I had in mind.

For example, I asked for a more vibrant color and I got this. It forgot the letter H entirely. Such lapses in following a thread through an iteration loop are common to all of the AIs I have looked at recently.

Did I like it? I think not. It seems that, in general, the more vibrant and flashy a design is, the more 'amateurish' it is considered, and the more boring, the more professional.

More here, https://jetpack.com/ai/



Tuesday, February 27, 2024

How much of the AI Assistants should we believe in?

 

This content is periodically updated and is not static.

Content has been a joint effort by a human and some of the AI assistants. 

I asked a naive but simple question to CoPilot. He was very honest in sharing his opinion related to the question. 

"How do you address flaws in the LLM? Perhaps, you are just a consumer of LLM."

He answered my question which cleared some doubts in my mind.

Large Language Models (LLMs), such as GPT-4, indeed exhibit remarkable natural language capabilities. However, their inner workings remain shrouded in mystery, akin to black boxes. Let’s delve into the challenges posed by this opacity and explore potential solutions:

  1. The LLM Black Box Problem:

    • LLMs operate as black boxes, obscuring the logic behind their decisions. This lack of transparency has safety and ethical implications.
    • Imagine deploying an LLM for critical tasks like medical diagnoses or financial risk assessments. If we don’t understand how it reaches conclusions, can we trust it?
    • The scale and complexity of LLMs exacerbate this issue. Models like GPT-3, with 175 billion parameters, defy easy comprehension.
    • Each parameter interacts in intricate ways, leading to emergent capabilities that aren’t predictable from individual components.
    • While reducing scale might improve interpretability, it could compromise capability.
    • Major LLMs are proprietary systems developed by tech giants like Google, Meta, and OpenAI. Their complete details remain undisclosed, treating LLMs as trade secrets.
  2. Flawed Decision Making:

  3. Emerging Solutions:

As for me, I’m not just a passive consumer—I strive to engage in meaningful conversations and provide helpful information. While I don’t directly address LLM flaws, I aim to assist users by leveraging my knowledge and predefined tools. 🤖📚


CoPilot is thinking only of healthcare and criminal justice, but in the near term AI will have enormous consequences for wars and elections. State actors will emerge to make world harmony a distant dream.

Conversing with CoPilot is often tedious. Sometimes CoPilot does not answer your question at all; instead, it addresses the key phrases individually, independent of your question. It feels like conversing with a hearing-impaired person. It is still a work in progress.

5/18/2024

Gemini AI is now easy to access. It performs just like CoPilot. If you ask about controversial subjects, or ask Gemini to fact-check, it declines, saying it is still in learning mode. However, Google tries to get feedback from users to improve the product. This is important because Gemini can return absolutely wrong answers.

7/22/2024

CoPilot has been instructed not to answer politics-related questions, even if you are seeking publicly available information. It is in clam-shell mode.

7/22/2024

Of course, you must have heard of Meta AI on WhatsApp. It does answer any question.


Tuesday, September 19, 2023

How trustworthy is AI?

 AI seems to be very promising. It is supposed to make a continental shift type of change. Has that time arrived, yet?

Health care professionals, nutritionists, and patients of all kinds seek information about what they eat. AI seems to be an ideal tool for getting a summarized report of what can be eaten, and by whom.

I tried to find out about 'Ragi', also known as Finger Millet, which has properties superior to rice in its energy and fiber content. I used the easily available Bing AI. I found it awfully lacking in the information it could find.

As I am concerned about diabetes control, I wanted to find nutrition information about 100 grams of Ragi flour.

Here is a picture of what Bing AI (on the right) thinks it is and what the web brings up (on the left).


This is Bing AI's reply, copied from the picture above:

I apologize for the confusion. You are correct, and I apologize for the incorrect information in my previous response. The values I provided for 10 grams of uncooked ragi powder were mistakenly repeated for 25 grams of uncooked ragi powder. Here are the correct values for 25 grams of uncooked ragi powder:

  • Carbohydrates: Around 0.89 grams.
  • Protein: Approximately 0.13 grams.
  • Potassium: Around 26.66 milligrams.
  • Cholesterol: Cholesterol is found only in animal products, so ragi powder does not contain any cholesterol.
  • Calories: The calorie content of 25 grams of uncooked ragi powder is not available in the search results I found.

I apologize for any confusion caused by my previous response. Please let me know if there’s anything else I can help with!

This is from the web page (Ragi Nutritional Value Per 100g, Finger Millet Health Benefits, Ragi Recipes, Ragi Flour Nutrition Facts | ParentCircle).

Ragi nutritional value (per 100g)

Energy  320kcal

Dietary fiber  11.18g

Total carbohydrate  66.82g

Total fat  1.92g

Protein  7.16g

Total folates  34.66mcg

Vitamin B1 (thiamin)  0.37mg

Vitamin B2 (riboflavin)  0.17mg

Vitamin B3 (niacin)  1.34mg

Vitamin B6 (pyridoxine)  0.05mg

Vitamin K  0.9mcg

Calcium  364mg

Iron  4.62mg

Magnesium  146mg

Phosphorus   210mg

Potassium   443mg

Manganese   3.19mg

Zinc  2.53mg

There is a huge difference in the carbohydrate value between the two.
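To put the two sets of numbers on the same footing, the per-100 g values from the web page can be scaled to the 25 g portion that Bing AI reported on. The sketch below does only that arithmetic. The function name scale_nutrients and the dictionary keys are mine; the figures are the ones quoted above.

```python
# Per-100 g values quoted from the web page above
per_100g = {"carbohydrate_g": 66.82, "protein_g": 7.16, "potassium_mg": 443}

# Values Bing AI gave for 25 g of uncooked ragi powder
bing_25g = {"carbohydrate_g": 0.89, "protein_g": 0.13, "potassium_mg": 26.66}


def scale_nutrients(values, grams):
    """Scale per-100 g nutrient values to a portion of the given weight."""
    return {name: amount * grams / 100 for name, amount in values.items()}


web_25g = scale_nutrients(per_100g, 25)
for name in per_100g:
    print(f"{name}: web {web_25g[name]:.1f}, Bing AI {bing_25g[name]:.2f}")
```

By this scaling, 25 g of ragi flour should contain about 16.7 g of carbohydrate, nearly 19 times the 0.89 g that Bing AI reported.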


Friday, June 19, 2020

What are the uses of the Python Language?

You may want to read this link to learn the use cases of Python.

Insurance: creating business insights with machine learning
Retail banking: flexible data transformation and manipulation
Aerospace: meeting software system deadlines
Finance: data mining to identify cross-sell opportunities
Business services: API access to financial information
Hardware: automating network administration
Healthcare: predicting disease prognosis
Consulting services: bespoke web development
InfoTech: IT modernization
Software: adding extensibility to legacy applications

Source:
https://www.activestate.com/resources/datasheets/top-10-python-use-cases/


Python is also used for:
1) Web Applications, where it is used for backend scripting
2) Desktop GUI Applications.
3) Software Development. ...
4) Scientific and Numeric. ...
5) Business Applications. ...
6) Console Based Application. ...
7) Audio or Video-based Applications. ...
8) 3D CAD Applications.

For more applications read here:
https://intersog.com/blog/python-application-development-common-use-cases-and-project-examples/

Read all Python-related posts in this blog:
https://hodentekhelp.blogspot.com/search?q=python

Sunday, September 23, 2018

What is SPSS?

Data Science is trending. What with Machine Learning and Artificial Intelligence, learning to work with statistical analysis tools will take you far.

Here is an IBM one liner for SPSS:

"Propel research & analysis with a fast and powerful solution"

SPSS is an acronym for Statistical Package for the Social Sciences. Its beginnings date back to 1968, and it was acquired by IBM in 2009. Files created by this software have the extension .sav. SPSS outputs tables and charts that can be processed by MS Word, Google Docs, Open Office, etc.

SPSS can be used for editing and analyzing data similar to other software such as R, Excel, Python, etc.

Here is a PR demo from IBM




More SPSS stuff here:

https://hodentekhelp.blogspot.com/2018/09/is-spss-software-free.html

https://hodentekhelp.blogspot.com/2018/09/how-do-you-read-spss-file-using-r.html

Monday, July 9, 2018

What are Geography and Stock data types in MS Excel?

Stocks and Geography are the new data types in MS EXCEL Office 365.

I think you must have specific versions of Office 365 for you to work with these new data types.

Microsoft is excited but I am not, as I may have to pay up for (or upgrade subscription) something I may only use to see what it can do.

The nice thing about these data types is that they are AI-powered and they look things up in the Microsoft Knowledge Graph, the service that powers Microsoft's search engine, Bing.

Each data type supplies a context, and the search is carried out within that context.

These are shots of Excel sheet with the Geography data type from the above link.