Random Thoughts on SQL Server

Recently, I created a user-defined SQL Server function to return random characters to be used as an initial password when signing up for an account.   Without knowing a few things about the RAND() function in advance, this fifteen-minute project could take you several hours.   The actual code for creating the random password I will discuss in a future article (http://www.dbtechie.com/sql-server/user-defined-functions-udf/easy-random-text-generator-using-selective-characters/)    In this article, I will discuss a few of the pitfalls of the RAND() function.

                                        

Why can’t I use RAND() in a user-defined function and how do I get around it?

The first thing to know is that the RAND() function (which returns a float between 0 and 1) is one of the nondeterministic built-in functions that cannot be called from a user-defined function (UDF).  Per msdn (http://msdn.microsoft.com/en-us/library/ms178091(SQL.90).aspx), “Deterministic functions always return the same result any time they are called with a specific set of input values and given the same state of the database. Nondeterministic functions may return different results each time they are called with a specific set of input values even if the database state that they access remains the same.”  For a list of nondeterministic functions that can and cannot be used in UDF, see this msdn link:  http://msdn.microsoft.com/en-us/library/ms191320(SQL.90).aspx.

If you tried using RAND() in your user-defined function, you probably saw the message, “Invalid use of side-effecting or time-dependent operator in ‘rand’ within a function.”   In other words, RAND() is one of the nondeterministic functions that SQL won’t let you use in a UDF.  The solution is simple.  Put the RAND() function in a view and refer to the view in the UDF.

CREATE VIEW [dbo].[vw_RandomFloat]

AS

SELECT     RAND() AS RandValue

—–

CREATE FUNCTION MyRand()

RETURNS FLOAT

AS

BEGIN

      RETURN (select RandValue from dbo.vw_RandomFloat)

END

—–

SELECT dbo.MyRand()

 

Why aren’t my values random?

Using RAND() without parameters  works just fine if you only plan on getting one random value per transaction.  RAND() is considered a pseudo-random function, because it depends on a seed value to determine where to start returning a predetermined sequence of values.   The same seed value will always return the same sequence of values.  RAND() is “seeded” with an integer to randomize the values returned.  The classic way to randomize is by seeding using a function from the internal clock of the computer, which is how the RAND() function will seed if you do not specify a seed value.   The problem is that if you include the RAND() function in a SELECT statement that returns multiple rows, the same seed value is used for every row in the result set of the SELECT statement.

We now find ourselves with the paradoxical problem of needing to find a way to create a random integer to seed the random function.  At first, you may think to use an identity column as a seed, but you will quickly be discouraged with this method because seed values that are close in value also return values that are close in value.  For example, RAND(1) returns 0.713591993212924 and RAND(2) returns 0.713610626184182.  These values are close enough that, for all intents and purposes , they might as well be the same number.

The best solution I have seen is to seed using the NEWID() function.  NEWID() returns a uniqueidentifier, so there is still a tiny bit of work left to convert it to an integer to seed the RAND() function.   The easiest is by using the CHECKSUM() function: Checksum(NEWID()).  You could also use the CAST() function: CAST(CAST(NEWID() AS VARBINARY) AS INT).

So, the new and improved random view would be:

CREATE VIEW [dbo].[vw_RandomFloat]

AS

SELECT     RAND(Checksum(NEWID())) AS RandValue

1 comment:

  1. UDF to return random password | DBTechie (Pingback), 1. July 2009, 20:14

Write a comment: