Monday, July 22, 2013

Making a Web Application in PHP Using UTF-8

According to Wikipedia, UTF-8 (UCS Transformation Format—8-bit) is a variable-width encoding that can represent every character in the Unicode character set. It was designed for backward compatibility with ASCII and to avoid the complications of endianness and byte order marks in UTF-16 and UTF-32.
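To make "variable-width" concrete, here is a minimal sketch (assuming PHP 7 or later and a source file saved as UTF-8) that compares the byte count and the code point count of a few strings. The helper name utf8Widths is my own, not a built-in function.

```php
<?php
// Hypothetical helper: reports how many bytes and how many code points a string has.
// strlen() counts bytes; the /u modifier makes PCRE read the subject as UTF-8,
// so matching '.' repeatedly counts code points.
function utf8Widths(string $text): array
{
    $codePoints = preg_match_all('/./us', $text);
    return ['bytes' => strlen($text), 'code_points' => $codePoints];
}

foreach (['A', 'ñ', 'स', 'सजल'] as $sample) {
    $w = utf8Widths($sample);
    printf("%s: %d byte(s), %d code point(s)\n", $sample, $w['bytes'], $w['code_points']);
}
```

An ASCII letter takes one byte, 'ñ' takes two, and each Devanagari letter takes three, which is why byte-oriented functions such as strlen() should not be used to count characters.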

Our aim is to make a simple web application with a user-facing form in which the user can enter both plain ASCII and other Unicode characters. The form data will be stored in a database, and finally a report page will display the data.

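Creating the form is very simple; nothing extra needs to be done with the form elements in this case. Now, let's move on to the database part. Only one change is needed in the table fields: the field that will contain Unicode characters should use the utf8 character set with a matching collation. Here I use utf8_bin, which compares strings byte by byte and is therefore case-sensitive. Note that MySQL's utf8 stores at most three bytes per character, so characters outside the Basic Multilingual Plane (such as emoji) need utf8mb4 instead.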

I have created a table named test with 3 fields - id, name and status.

CREATE TABLE IF NOT EXISTS `test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `status` int(11) NOT NULL DEFAULT '1',
  PRIMARY KEY (`id`)
);

The next step is to insert the form data into the table. This works the same as usual: just run an INSERT command with the values posted from the form.
 
INSERT INTO `test` (`id`, `name`, `status`) VALUES
(1, 'सजल', 1),
(2, 'सूर्या', 1),
(3, 'मनीष', 1),
(4, 'राहुल', 1);
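In a real form handler the values come from the request rather than from a literal SQL string. The following is only a sketch of one way to do that, assuming PHP 7 or later, the PDO extension, and a connection opened with charset=utf8 in the DSN. The helper names insertName and isValidUtf8, and the form field name in the usage comment, are my own assumptions.

```php
<?php
// Hypothetical check: returns true only for well-formed UTF-8 input.
// An empty pattern with the /u modifier fails (returns false) on invalid UTF-8.
function isValidUtf8(string $value): bool
{
    return preg_match('//u', $value) === 1;
}

// Hypothetical helper: inserts one posted name using a prepared statement,
// so the value is never concatenated into the SQL text.
// id is AUTO_INCREMENT and status defaults to 1, so only name is supplied.
function insertName(PDO $pdo, string $name): bool
{
    $stmt = $pdo->prepare('INSERT INTO `test` (`name`) VALUES (:name)');
    return $stmt->execute([':name' => $name]);
}

// Usage sketch (connection details and the 'name' form field are assumptions):
// $pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8', 'user', 'password');
// if (isset($_POST['name']) && isValidUtf8($_POST['name'])) {
//     insertName($pdo, $_POST['name']);
// }
```

The prepared statement is a design choice of this sketch, not something the steps above require; it keeps quoting problems out of the query regardless of which script the user types in.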
 
In the sample above, I am passing Unicode characters in the name field. The final task is to display the records on a web page. This is also quite simple; we just need to make two additions:

1) Use a meta tag

Short tag: <meta charset="utf-8">
Long tag: <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Or you can set it with PHP's header() function, which must be called before any output is sent:
header('Content-Type: text/html; charset=utf-8');

2) Set the MySQL client character set

mysql_query("SET NAMES 'utf8'");
mysql_query("SET CHARACTER SET utf8");
mysql_query("SET COLLATION_CONNECTION = 'utf8_unicode_ci'");

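Run the above commands before issuing the SELECT query. SET NAMES tells the server which character set the client uses to send SQL statements and which character set the server should use when sending results back. It is needed whenever you exchange data containing characters that cannot be represented in pure ASCII, like 'ñ' or 'ö'. Note that the mysql_* functions are deprecated as of PHP 5.5 and removed in PHP 7; the mysqli and PDO extensions provide equivalents.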
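Putting the two additions together, a report page could look roughly like the sketch below. It assumes PHP 7 or later and the mysqli extension, and it uses mysqli's set_charset() method in place of the SET NAMES query. The function names fetchNames and renderReport and the connection details in the usage comment are hypothetical.

```php
<?php
// Hypothetical helper: reads all names from the test table over a UTF-8 connection.
function fetchNames(mysqli $db): array
{
    // set_charset() sets the connection character set, like SET NAMES does.
    $db->set_charset('utf8');

    $names = [];
    $result = $db->query('SELECT `name` FROM `test` ORDER BY `id`');
    if ($result === false) {
        return $names; // query failed; return an empty list in this sketch
    }
    while ($row = $result->fetch_assoc()) {
        $names[] = $row['name'];
    }
    return $names;
}

// Hypothetical helper: builds the report HTML with the charset declared
// and every value escaped as UTF-8 text.
function renderReport(array $names): string
{
    $items = '';
    foreach ($names as $name) {
        $items .= '<li>' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . "</li>\n";
    }
    return "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n"
         . "<title>Report</title>\n</head>\n<body>\n<ul>\n"
         . $items
         . "</ul>\n</body>\n</html>\n";
}

// Usage sketch (connection details are assumptions):
// header('Content-Type: text/html; charset=utf-8');
// $db = new mysqli('localhost', 'user', 'password', 'mydb');
// echo renderReport(fetchNames($db));
```

Keeping the rendering separate from the query makes it easy to check the HTML output without a database connection.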

That is all there is to it. By following the steps above, you can make a dynamic website in your native language. Questions and comments are most welcome. I would be happy to help.
