views:

245

answers:

3

I am currently writing my truly first PHP Application and i would like to know how to project/design/implement MySQL Views properly;

In my particular case User data is spread across several tables (as a consequence of Database Normalization) and i was thinking to use a View to group data into one large table:

CREATE VIEW `Users_Merged` (
name,
surname,
email,
phone,
role
) AS (
SELECT name, surname, email, phone, 'Customer'
FROM `Customer`
)
UNION (

SELECT name, surname, email, tel, 'Admin'
FROM `Administrator`
)
UNION (

SELECT name, surname, email, tel, 'Manager'
FROM `manager`
);

This way i can use the View's data from the PHP app easily but i don't really know how much this can affect performance.

For example:

SELECT * from `Users_Merged` WHERE role = 'Admin';

Is the right way to filter view's data or should i filter BEFORE creating the view itself? (I need this to have a list of users and the functionality to filter them by role).

EDIT

Specifically what i'm trying to obtain is Denormalization of three tables into one. Is my solution correct? See Denormalization on wikipedia

+2  A: 

In general, the database engine will perform the optimization for you. That means that the engine is going to figure out that the users table needs to be filtered before being joined to the other tables.

So, go ahead and use your view and let the database worry about it.

If you detect poor performance later, use MySQL EXPLAIN to get MySQL to tell you what it's doing.

PS: Your data design allows for only one role per user, is that what you wanted? If so, and if the example query you gave is one you intend to run frequently, make sure to index the role column in users.

Larry Lustig
Yes, User Role is disjointed by design; the fact is that the role column exists only in this view (I need this column for filtering), how can i index it? I'm not sure about this, but can a view in mysql have indexes?
Gianluca Bargelli
MySQL will use the index on the TABLE to do the selection. In general, it is a mistake to try to out think your DBMS. These things are pretty complicated under the hood and all the simple cases are already optimized. Just index the likely columns and then don't worry about performance until and unless you see poor performance.
Larry Lustig
MySQL does not optimize views, plain and simple. You should worry about your views just as much as you worry about regular queries - no more, no less.
B T
@Larry Lusting I am almost there, even if i don't really get the idea to optimize only and only if i see bad performance, can you kindly elaborate on the subject?
Gianluca Bargelli
Build your tables. Index the columns that you are going to search on. Don't worry about the views, the DBMS will optimize those queries based on your table structure and index. Do not worry about the speed of the views unless you notice a problem with their performance in testing. For all except the largest databases and most complex queries, the performance will be fine.
Larry Lustig
I'll take this as the answer to my question, thanks!
Gianluca Bargelli
+1  A: 

If you have <1000 users (which seems likely), it doesn't really matter how you do it. If the user list is unlikely to change for long periods of time, the best you can probably do in terms of performance is to load the user list into memory and not go to the database at all. Even if user data were to change in the meantime, you could update the in-memory structure as well as the database and, again, not have to read user information from the DB.

Tomislav Nakic-Alfirevic
Thanks for your answer but i am searching for a more generical approach on this.
Gianluca Bargelli
For a more generic approach, use an Object-Relational Mapper and reap the benefits: declarative performance tuning, all kinds of protection (sql injection being the obvious example), read-write support (as opposed to the read-only pattern were discussing here), database independence (a Very Good Thing!) etc.
Tomislav Nakic-Alfirevic
Very interesting indeed! I'll try to search something on the subject!
Gianluca Bargelli
A: 

You would probably be much better off normalizing the Administrators, Users, Managers and what-have-you into one uniform table with a discriminator column "Role" that would save a lot of duplication, which is essentially the reason to do normalization in the first place. You can then add the role specific details to distinct tables that you use with the User table in a join.

Your query could then look as simple as:

SELECT
   `Name`, `Surname`, `Email`, `Phone`, `Role`
FROM `User`
WHERE 
    `User`.`Role` IN('Administrator','Manager','Customer', ...)

Which is also easier for the database to process than a set of unions

If you go a step further you could add a UserRoleCoupling table (instead of the Role column in User) that holds all the roles a User has per user:

CREATE TABLE `UserRoleCoupling` (
    UserID INT NOT NULL,  -- assuming your User table has and ID column of INT
    RoleID INT NOT NULL,
    PRIMARY KEY(UserID, RoleID)
);

And put the actual role information into a separate table as well:

CREATE TABLE `Role` (
    ID INT NOT NULL UNIQUE AUTO_INCREMENT,
    Name VARCHAR(64) NOT NULL
    PRIMARY KEY (Name)
)

Now you can have multiple roles per User and use queries like

SELECT
    `U`.`Name`
   ,`U`.`Surname`
   ,`U`.`Email`
   ,`U`.`Phone`
   ,GROUP_CONCAT(`R`.`Name`) `Roles`
FROM `User`
INNER JOIN `UserGroupCoupling` `UGC` ON `UGC`.`UserID` = `User`.`ID`
INNER JOIN `Role` `R` ON `R`.`ID` = `UGC`.`RoleID`
GROUP BY
    `U`.`Name`, `U`.`Surname`, `U`.`Email`, `U`.`Phone`

Which would give you the basic User details and a comma seperated list of all assigned Role names.

In general, the best way to normalize a database structure is to make the tables as generic as possible without being redundant, so don't add administrator or customer specific details to the user table, but use a relationship between User and Administrator to find the specific administrator details. The way you're doing it now isn't really normalized.

I'll see if i can find my favorite book on database normalization and post the ISBN when I have time later.

Kris
Normalization does indeed remove duplicates but also tends to create additional tables as you stated, anyway my intent is to keep different kind of users into different tables for security purposes (SQL Injections).
Gianluca Bargelli
the risk of SQL Injection is a good reason to use things like stored procedures, prepared statements and common sense when accessing data based on user input. I don't see how it relates to splitting up a table of same data into multiple tables though. If you have several million users in all those tables it might be faster, but also harder to maintain.
Kris
I forgot to tell you that these tables shares only a subset of values (name, surname, email, phone) in common, the others columns are completely different (is not a partition of the same data). About the security, for example if a malicious attacker gains data from a particular table in the database at least he doesn't get the whole thing, even i'm sure there are better ways to accomplish this task.
Gianluca Bargelli
@Gianluca: that changes things a bit on the data partitioning end, and you you are more or less correct about the data vulnerability, though you're mitigating a sympton instead of handling the potential cause of trouble. I hope your data access is done with something like PDO and you're using prepared statements and such.
Kris
I agree with you, all components in a system must be equally solid! At the moment i'm using mysqli but i'll definitely pass to PDO (I heard that prepared statements are much easier to handle). Even if this doesn't answer my original question thank you for leading me in the right way of thinking!
Gianluca Bargelli