Spam protection


Wouldn’t it be great if all unsolicited commercial email, or spam as it’s more commonly known, came with a tag in the subject line that identified it as spam?  Not surprisingly, the people who send spam don’t see the value in that idea.  The job of defending networks and users against spam is on the shoulders of network administrators and users.  Understanding what defences are available and how they work will help users and network administrators choose the best solution for their environment.

Spam filtering can be done at any of four points in the path of an email: as it is sent, before it reaches the destination server, on the destination server, or at the end client.  Checking for spam as it is sent seems, on the surface, to be futile, since someone intending to broadcast spam would certainly not do it.  In fact, a large portion of spam is sent by malware that has been installed on end user hosts without the owner’s knowledge.  Checking for spam leaving a host would reveal the problem, which could then be corrected using anti-malware tools.  Companies that host their own mail servers, as well as many hosting providers, check email for spam as it arrives or while it is being processed by the mail server.  Filtering spam before it arrives at the server reduces network traffic and isolates the server from malware that may be contained in spam, but often limits the ability of users to allow email that may appear to be spam from people or businesses they know.  Allowing such mail is called whitelisting a sender or email domain.  Moving the spam filter to the mail server often adds simpler configuration tools that let administrators or users adjust the whitelist (desired) and blacklist (unwanted) settings on either a global or a per-user basis.  Client-based filtering allows users to decide individually how to identify spam and how to act on each message based on the rules they configure.  Any or all of these types of filtering can be used, depending on how much control a network administrator wants to retain or offload to users, and most solutions are a hybrid.
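To make the global versus per-user idea a little more concrete, here is a minimal sketch in Python.  The domain names, the user name and the rule that personal lists are checked before global ones are all assumptions made for illustration; real products differ in how they order and store these lists.

```python
# Hypothetical allow/block lists; every domain and user name here is made up.
GLOBAL_WHITELIST = {"partner.example"}
GLOBAL_BLACKLIST = {"spammer.example"}
USER_LISTS = {
    "alice": {"whitelist": {"newsletter.example"}, "blacklist": set()},
}


def sender_domain(address):
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def list_verdict(sender, user):
    """Return 'allow', 'block' or 'scan' for a sender and a recipient user."""
    domain = sender_domain(sender)
    personal = USER_LISTS.get(user, {"whitelist": set(), "blacklist": set()})
    # Assumption: per-user settings are checked first so users can override
    # the global lists.
    if domain in personal["whitelist"]:
        return "allow"
    if domain in personal["blacklist"]:
        return "block"
    if domain in GLOBAL_WHITELIST:
        return "allow"
    if domain in GLOBAL_BLACKLIST:
        return "block"
    # Not listed anywhere: hand the message over to the content tests.
    return "scan"
```

A message that comes back as "scan" would then go through whatever content tests the filter is configured to run.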

Regardless of which stage in the email flow a filter is testing for spam, the same criteria can be used.  Most spam filters use a number of tests in a certain order, with specific settings for action and logging for each test.  The best filter will allow all desirable mail through and block all spam, but since the spammers are constantly fighting the filters, new rules must constantly be implemented.  Having a subscription with the spam filter vendor allows the filter to be updated with defences against new spamming methods very soon after they are discovered.  Here are some of the more common methods included in spam filters to detect and protect against spam.

Heuristics/Bayesian analysis – These intelligent filters learn and use statistics to determine if a message is spam.

Reputation – Most anti-spam vendors keep databases of known spammers and tag any mail from their domain or IP address as spam.

Phishing – The scan engine looks for links in emails to known phishing sites and tags them.

RBL – Real-time Blackhole List, also called a real-time blacklist – These are third party lists of reported spammers on the Internet.  Be aware that these blacklists often include IPs and domains that only have temporary issues with malware, or that are simply part of a large listed range of addresses.

Header checking – The scan engine compares the SMTP (envelope) address with the MIME address in the message header to make sure they match, and also looks for other header anomalies.  Email clients set the MIME address, while the server sets the SMTP address.  Spammers use this difference to mask the origin of an email.

Directory harvesting – Emails sent to multiple non-existent accounts in an email domain are marked as spam.  The term directory harvesting is used because by sending large lists of names to a mail server, most will bounce back as failed, but some may not.  Eventually the sender can, by process of elimination, build a list of valid email addresses on the domain.

SPF – Sender Policy Framework – A DNS record, published as a TXT record, that defines the hosts that are authorized to send email for a given domain.  Although not widely used initially, this method of protecting a domain has become more common.

rDNS – Reverse DNS – Where DNS takes a domain name and translates it to an IP address, rDNS starts from the IP address and checks that the domain name in the email header matches the domain name that points to that IP.  This identifies servers that are relaying mail for other, typically unauthorized, domains.

Blacklist – A list on the anti-spam server that is generated automatically or by user intervention that identifies domains or IP addresses that users have reported as spam.

Keyword checking – A list, generated automatically or by user intervention, of words that appear in undesirable emails.  These words could be profanity, pornographic or pharmaceutical terms, or anything else considered undesirable.
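As a rough illustration of running tests in a set order with an action attached to each, here is a minimal Python sketch covering just two of the methods above, header checking and keyword checking.  The keyword list, the test order and the action names are assumptions for the example and are not taken from any particular product.

```python
from email import message_from_string
from email.utils import parseaddr

BLOCKED_KEYWORDS = {"viagra", "lottery"}  # hypothetical keyword list


def header_mismatch(envelope_sender, raw_message):
    """True when the envelope sender's domain differs from the From header's."""
    msg = message_from_string(raw_message)
    _, header_addr = parseaddr(msg.get("From", ""))
    env_domain = envelope_sender.rsplit("@", 1)[-1].lower()
    hdr_domain = header_addr.rsplit("@", 1)[-1].lower()
    return env_domain != hdr_domain


def keyword_hit(raw_message):
    """True when the subject or a plain-text body contains a blocked keyword."""
    msg = message_from_string(raw_message)
    body = msg.get_payload()
    if not isinstance(body, str):  # multipart mail is ignored in this sketch
        body = ""
    text = (msg.get("Subject", "") + " " + body).lower()
    return any(word in text for word in BLOCKED_KEYWORDS)


# Tests run in this order; each one carries its own action.
TESTS = [
    ("header", lambda env, raw: header_mismatch(env, raw), "tag"),
    ("keyword", lambda env, raw: keyword_hit(raw), "block"),
]


def run_filter(envelope_sender, raw_message):
    """Return (test name, action) for the first test that fires."""
    for name, test, action in TESTS:
        if test(envelope_sender, raw_message):
            return name, action
    return "none", "deliver"
```

Real filters weigh many more signals and usually combine scores rather than stopping at the first hit, so treat this only as a picture of the ordering idea.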

Regardless of which type of spam filtering is used and which tests are implemented, all but the most expensive appliances require configuration and some amount of learning.  Don’t expect a product to stop all spam right out of the box.  With some adjustments and ongoing updates they are great tools to keep users productive.

DNS


Like many concepts in computer networking, DNS is a simple function that is difficult to grasp initially.  DNS is an acronym for Domain Name System and the role it plays is very similar to that of a telephone book.  If we look at how computers communicate over networks or over the Internet in a very basic form, it is all based on numbers.  Like telephones, every device or host has a unique number assigned to it called an Internet Protocol (IP) address.  This can be a little confusing because your computer and mine may currently have the same IP address, but chances are that neither is connected directly to the Internet.  My computer is on a network that has a firewall connecting to the Internet.  My firewall has a unique IP address on the Internet even though my computer may not.  You could look at this like people’s names, in that there are a number of entries for Siverns in the phone book, but if you call my number there is only one Glenn in the house.  It is not typical for someone to remember a person by their telephone number or their street address.  The designers of the Internet understood this and created DNS as a system that is similar to a telephone book for people and other computers to use as a directory to locate hosts on the Internet.

When configuring the IP settings on a computer a DNS server address is defined.  This may be done manually or, for a computer receiving its IP from a Dynamic Host Configuration Protocol (DHCP) server, automatically.  A host is also typically given a name to identify itself.  DNS servers keep records of host names and the IP addresses associated with them.  If an IP address is changed, users and computers can still find the host by name using a DNS server.  DNS can also work closely with DHCP servers to keep records up to date without the need for a system administrator to manually update them.

On the Internet there are a number of “root” servers that are used to direct requests to other servers which hold the DNS records of hosts.  When attempting to access a host such as www.basicbusiness.com, the computer will first ask the DNS server that is configured on the host.  If that server does not have the record, it will send the request to a root server, which refers it to the servers responsible for the top level domain .com.  Those servers have a record for the name server that knows where the basicbusiness.com records are stored.  The request is then sent to that DNS server, which replies with the IP address of the host “www” in the basicbusiness.com domain.  You could think of this process as being similar to looking for a phone number for Glenn Siverns in Vancouver, BC, Canada.  You might go to a library to find a shelf with Canadian phone books and select the Vancouver book, which would contain the number.
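The chain of lookups can be pictured with a toy model in Python.  Nothing here talks to a real DNS server: the dictionaries, the server names and the address 192.0.2.10 are all made up for illustration, and the model only handles names of the form host.domain.tld.

```python
# Toy data standing in for real servers; every name and address is made up.
ROOT = {"com": "com-servers"}
TLD_SERVERS = {"com-servers": {"basicbusiness.com": "ns.basicbusiness.com"}}
AUTHORITATIVE = {
    "ns.basicbusiness.com": {"www.basicbusiness.com": "192.0.2.10"},
}
LOCAL_CACHE = {}  # what the locally configured DNS server already knows


def resolve(hostname):
    """Return (ip, steps), where steps lists the servers that were consulted."""
    steps = ["local DNS server"]
    if hostname in LOCAL_CACHE:
        return LOCAL_CACHE[hostname], steps
    labels = hostname.split(".")
    tld = labels[-1]                     # e.g. "com"
    domain = ".".join(labels[-2:])       # e.g. "basicbusiness.com"
    steps.append("root server")
    tld_server = ROOT[tld]               # root points at the .com servers
    steps.append(tld_server)
    name_server = TLD_SERVERS[tld_server][domain]
    steps.append(name_server)
    ip = AUTHORITATIVE[name_server][hostname]
    LOCAL_CACHE[hostname] = ip           # remember the answer for next time
    return ip, steps
```

Calling resolve("www.basicbusiness.com") twice shows why a second visit to a site is usually quicker: the first call walks every step, while the second is answered by the local server alone.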

Many problems related to accessing web sites on the Internet can be attributed to DNS problems.  When an Internet browser reports that it is unable to find a web site, yet the computer can still receive email, the problem is very likely a name resolution issue attributed to DNS.  Here are some terms you may hear when someone is helping diagnose a DNS issue:

IPCONFIG /all – This command will display the IP configuration of a host.  When you type IPCONFIG at a command prompt, the computer will display the IP address of the host, the subnet mask, and the gateway address.  If the command “IPCONFIG /all” is entered, the computer also reports the DNS server addresses, among other things.

PING – Similar to the use of sonic pings in submarines, ping is used to see if something responds.  Ping can be used to identify or verify a host by name or by IP address.  For example, if you type “PING www.basicbusiness.com” at a command prompt the command should list the IP address assigned to www.basicbusiness.com and send a number of packets to the IP address to see if it responds.  Note that many hosts will not respond because of security configurations, but if the ping command is able to determine the IP address then DNS is likely working.  Another useful test may be to try to ping a host by IP address if it’s known.  If the host responds to a ping by IP address, but not by name then DNS is not resolving names properly.  If neither test gets a response then there may be other network issues.
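The reasoning behind the two ping tests can be written down as a small Python sketch.  It uses the standard socket.gethostbyname call for the name lookup, while the responds argument is a hypothetical function you would supply yourself to say whether a host answers at a given IP address, since Python has no built-in ping.

```python
import socket


def diagnose(hostname, known_ip, responds, resolve=socket.gethostbyname):
    """Return a short diagnosis following the two ping tests.

    responds : function taking an IP address, returning True if the host answers
    resolve  : function mapping a host name to an IP address; it should raise
               OSError when the name cannot be resolved
    """
    try:
        resolved = resolve(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        resolved = None
    if resolved is not None:
        return f"DNS is likely working: {hostname} resolved to {resolved}"
    if responds(known_ip):
        return "Host answers by IP but not by name: DNS is not resolving names"
    return "No answer by name or by IP: look for other network issues"
```

Passing in the lookup and the reachability check as functions is only a convenience for this sketch; it lets the logic be tried out without touching a real network.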

I hope this takes some of the mystery out of DNS and helps point you in the right direction next time you have a problem with accessing the Internet.

Photos and pictures


I recently received some digital photos from a friend and noticed that the files were named differently from the ones from my own camera.  I was wondering why different camera manufacturers would use different file formats, so I did a little digging.

Camera manufacturers are always trying to get an edge by providing better features and image quality.  Much of the image quality is determined by the lenses, but once the picture is snapped, the image data needs to be stored.  I had a look at the specs on a high end camera, the Canon EOS 1DS Mark III digital SLR and they listed the image format as:

DCF 2.0 (Exif 2.21): JPEG, RAW and RAW+JPEG simultaneous recording possible. Multiple options for recording images on two memory cards, and onto compatible external USB hard drives (via optional Wireless File Transmitter WFT-E2A)

Since this is a high end camera, I thought that someone considering this purchase would be well up on the terminology and acronyms, so I looked at one of the consumer rated models instead.  The Canon PowerShot SX 120 IS Digital Camera had two separate specs for file format and compression:

Design rule for camera file system, DPOF (Version 1.1) compliant

Still Image: Exif 2.2 (JPEG)
Movie: AVI (Image data: Motion JPEG; Audio data: WAVE (Monaural))

Thinking that maybe it was just common for Canon to use these acronyms, I checked other manufacturers and was greeted with terms like CCD-RAW, TIFF, DPOF, and BMP.  There are some important things to look for in a digital camera, but choosing the right specs to compare may depend on your personal needs.

First, some basics about image files.  Image data can be stored in two very basic formats: raster and vector.  Vector graphics are more suited to line and shape drawings, including 3D images.  The basic premise of vector graphics is to define a starting point for a line, then a direction and distance.  The direction can be in two dimensions (2D) or three (3D).  Other parameters such as colour, fill, and weight are also defined.  Vector graphics are commonly used in computer aided design (CAD) drawings.

Raster graphics use a method similar to that of a television screen, where the image starts forming by defining a single dot or pixel, then moving to the adjacent pixel and defining it.  This continues across one dimension of the image, then moves to the next line or row until the entire image is formed.  The definition includes colour and intensity for each dot.  As you can guess, vector graphics take relatively little space to store because they can define large areas with just a few parameter definitions.  Raster graphics, by contrast, must define each pixel in the image.  To combat this space requirement, a number of different image formats and compression schemes have been developed.  The best quality images have no compression, but take the most room.  Because of the detail that can be stored in raster files, they are used as the storage format for cameras.  Here are some descriptions that may help:

CCD - Charge Coupled Device.  This is the special device in digital cameras that converts light to electronic data.  It is an array of microscopic light sensors on one silicon chip, usually numbering in the millions.  The term megapixel refers to the number of pixels in a CCD, counted in millions.  More pixels allow the CCD to detect more detail in an image.

RAW - This is not an acronym.  Image data is taken directly from the CCD and stored on the storage media.  RAW images are created very quickly, but take a lot of space.  If space is not an issue, then this is the best format for a camera.

DCF - Design rule for Camera File system.  This standard defines a common layout for camera image file systems.  It includes the folder structure and EXIF formatted files and has been adopted by most camera manufacturers.

EXIF - Exchangeable Image file Format.  EXIF adds structure and photographic information to existing image formats to standardize the file structure and help compatibility.  Information such as shutter speed, camera manufacturer and orientation are among the parameters defined.  Various versions of EXIF are in use by most camera manufacturers today.

JPEG - Joint Photographic Experts Group defined this standard for image compression.  There are a number of levels of compression available which decrease the storage footprint at the expense of image quality.  If you're looking for a camera that can store many pictures, then look for JPEG formats.  A low level of compression will retain most of the image quality with a good saving of storage space.

TIFF - Tagged Image File Format was developed by Aldus, which later became part of Adobe Systems, in an attempt to create a common image standard for scanners.  TIFF acts as a container which defines the geometry and other parameters of an image and may contain images in other formats such as JPEG.  TIFF files with embedded JPEG give the same storage saving as JPEG, but don't add much additional value.

DPOF - Digital Print Order Format adds definitions for output to images.  These definitions may include number of copies, output image size, orientation, and title text.  This standard was developed by a consortium of camera manufacturers to simplify printing of digital images.  DPOF was not widely adopted and has become unnecessary for digital printing.

BMP - Bitmap images were developed by Microsoft as a high quality uncompressed image format.  The large size of BMP images makes them unsuitable for cameras or as images on web pages.

PNG - Portable Network Graphics files were developed as a replacement for GIF, with lossless compression for use on the Internet.  Because the compression loses no image data, PNG files of photographs are much larger than JPEG files, which makes the format a poor fit for most camera use.
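If you are ever unsure which of these formats a file really uses, the first few bytes usually give it away regardless of the file name.  Here is a minimal Python sketch that checks the well-known starting bytes of JPEG, PNG, BMP and TIFF files; the function names are my own, and anything else, including most proprietary formats, is simply reported as unknown.

```python
# Well-known starting bytes ("signatures") for four of the formats above.
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"BM", "BMP"),
    (b"II*\x00", "TIFF"),  # little-endian byte order
    (b"MM\x00*", "TIFF"),  # big-endian byte order
]


def identify_image(header):
    """Return the format name for the first bytes of a file, or 'unknown'."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path):
    """Read the first eight bytes of a file and identify its format."""
    with open(path, "rb") as handle:
        return identify_image(handle.read(8))
```

For example, identify_file("IMG_0001.JPG") would read a photo copied from a camera card, where the file name here is just a placeholder.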

While there are many other formats defined and in use today, I hope this sheds some light on what to look for when choosing an image format or a camera that uses them.

Passwords


These days we seem to need a PIN or a password for just about everything.  If you follow everyone's recommendations, you'll end up with countless sets of random characters that you are expected to memorize and change on a regular basis.  That seems a little unreasonable for all but the few people who have perfect recall.  While I agree that these rules are important to keep your personal information and finances secure, I think there needs to be a happy medium where the risk matches the required effort.  It's a little risky for me to make suggestions about password security, so please remember that anything short of a completely random mix of characters that only you have memorized is at some level a security risk.

Good security involves three components: something you have, something you know, and something you are.  The something you have may be a bank card, security card, or key fob.  Something you know would be a PIN or password, and something you are is typically a biometric like a fingerprint or retina scan.

PIN numbers (yes, I do know that the N in PIN stands for number, but it flows better) are used widely in the financial arena to verify that the holder of the card is actually the person authorized to use it.  The weaknesses are that the third component, something you are, is still missing and the other two can be stolen or copied.  Bank cards go missing all the time, but fortunately most do not have the owner's PIN printed on the back.  If a card is stolen by someone who really wants to gain access, your best protection is a hard to guess PIN.  Obviously birthdates and anniversaries are not good options, and neither are easy to spot numbers like 5555.  Choose a PIN that is random, then come up with a way to remember it.  For example 4516 could be remembered by the word "deaf", which is made up of the 4th, 5th, 1st and 6th letters of the alphabet.  Patterns on the keypad as you punch the numbers in are sometimes helpful too.  One financial institution uses a combination of a couple of passwords for online banking, but only asks for certain characters each time you connect, so the entire password is never typed in a single session.
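The alphabet trick is easy to play with in a few lines of Python.  This is only a sketch of the idea: the function names are my own, a zero is shown as "?" because no letter sits at position zero, and only the letters a to i can be turned back into a single digit.

```python
import string


def letters_for_pin(pin):
    """Map each digit 1-9 to the letter at that position (1 -> a, 2 -> b...)."""
    letters = []
    for digit in pin:
        number = int(digit)
        # There is no letter at position 0, so mark it with a question mark.
        letters.append(string.ascii_lowercase[number - 1] if number else "?")
    return "".join(letters)


def pin_for_word(word):
    """Turn a word made of the letters a-i back into its digits."""
    digits = []
    for letter in word.lower():
        position = string.ascii_lowercase.index(letter) + 1
        if position > 9:
            raise ValueError(f"{letter!r} does not map to a single digit")
        digits.append(str(position))
    return "".join(digits)


print(letters_for_pin("4516"))  # deaf
print(pin_for_word("deaf"))     # 4516
```

Most random PINs will not spell a real word, so in practice you would use the letters as the starting point for a phrase instead.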

Email passwords and web site passwords use only one of the three components of good security: something you know.  These passwords are usually at more risk because most information passed over the Internet can be seen by malicious people with basic hacking skills.  Emails are sent in clear text, which means they should never contain passwords, credit card numbers or other important information.  If you're like me, you probably have a number of email addresses and access to many password protected web sites.  My memory is pretty good, but there's no way I could memorize random characters for over 100 accounts and change them on a regular basis.  My solution has been to categorize the email accounts and web sites according to their importance and risk to me.  Some accounts have a unique password, but others are in a category that contains a number of accounts with the same password.  I also use variations of a password in some cases so that I can remember them while maintaining a good level of security.  Newer computers often come with fingerprint readers and password "vaults" where you can store a number of passwords and only access them with a combination of a fingerprint and a password.  Since something you are is the most secure of the three components of good security, this combination is a good option for keeping your information secure.  I would, however, advise that you keep a list in a safe deposit box as well, since hardware can fail or be stolen and you may lose access yourself.

There are many technologies that have been in use for years that are now coming into everyday use, as well as improvements on the forefront, such as biometrics in payment cards.  As with any security, the best defence is always knowledge.  If you know what's risky you can avoid it.  With that in mind, I have a few copies of a security reference handbook written by Symantec that I will make available to the first three people who post comments.  Follow up with an email directly to me at gsiverns@basicbusiness.com with your address so I can send you the booklet.

 

Printing


Printed documents are created by attaching ink to paper.  The process involved and the quality of the output varies widely depending on the type of printer used.  Historically there have been many methods of printing and duplicating, but I will only talk about computer printers in this blog.

Printers are classified in one of three categories based on how they produce output.  These categories are character printers, line printers and page printers.  As you might guess the output is transferred to paper one character at a time, one line at a time or one page at a time respectively.  The most popular early printers were dot matrix printers.  These could be either character or line printers.  Dot matrix line printers typically had a row of small wires lined up across the width of the paper with an ink saturated ribbon between.  The ribbon is fed horizontally across the row of wires while the paper is moved vertically.  For each vertical step of the paper a combination of wires would be pushed at the paper causing the ribbon to touch and leave an entire line of dots across the page.  As the paper continues to move up the characters of the output are formed.  These printers are very fast, but are usually limited to text or basic graphics.

Another type of dot matrix printer uses a moving print head with a small number of wires (typically 9 or 24) placed vertically.  The head is moved horizontally across the paper while an ink ribbon is fed horizontally across the head.  As the head moves across, the wires are fired in a pattern to create each character of text one at a time.  The head may fire as it travels in one direction or both, depending on the printer and the quality of output.  After a line of text has been completed the paper is fed vertically to position for the next line.  Higher quality text could be created using a printwheel or ball printer.  These printers used the same method as many typewriters, spinning either a spoked wheel with embossed characters or a gimballed ball with embossed characters.  In the case of a printwheel, a single hammer would fire to push the character on to the paper with the ribbon between, leaving the character impression.  Ball printers would lift the entire ball to the paper with the correct character positioned to push the ribbon against the paper.  Neither of these types of printer could usually print graphics, but different fonts were available by changing the wheel or ball.  These are also considered character printers, and all of the technologies I’ve mentioned are able to print multi-part forms because they use impact to transfer the image.
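The way a moving head builds characters out of dots can be imitated in Python.  This sketch assumes a head with seven wires and two made-up character patterns, each five columns wide; real printers used their own built-in fonts and usually more wires.

```python
WIRES = 7  # assumed number of wires in the head, top wire is bit 0

# Hypothetical glyphs: one number per column, one bit per wire.
GLYPHS = {
    "H": [0b1111111, 0b0001000, 0b0001000, 0b0001000, 0b1111111],
    "I": [0b1000001, 0b1000001, 0b1111111, 0b1000001, 0b1000001],
}


def print_line(text):
    """Return the dots left on the paper after one pass of the head."""
    rows = [[] for _ in range(WIRES)]
    for char in text:
        for column in GLYPHS[char]:      # the head steps one column right
            for wire in range(WIRES):    # a wire fires if its bit is set
                fired = (column >> wire) & 1
                rows[wire].append("#" if fired else " ")
        for row in rows:
            row.append(" ")              # blank column between characters
    return "\n".join("".join(row).rstrip() for row in rows)


print(print_line("HI"))
```

Each column of the pattern corresponds to one position of the head, which is why the whole height of a character is printed in a single pass.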

Newer technologies have replaced most dot matrix printers.  Inkjet printers are character printers that are very similar to dot matrix character printers except that they force ink directly onto the paper through tiny nozzles on a moving head.  The head is moved across the paper while small dots of ink are ejected using either bubbles created by heat or electrically induced constrictions of the nozzles.  As with the dot matrix printers, the dots form the characters as the head moves.  Because there is no impact, multi-part forms cannot be printed.  The small size of the dots used to form the images allows for very high quality text or graphics to be printed.  Most inkjet printers also use multiple print heads moving together to produce full colour output.  Although there are a number of printers that use six or more heads, it is more common to have only four: black, cyan, magenta and yellow.  The cyan, magenta and yellow can be placed on the paper in close groups of dots, causing your eye to see the combined colour.  This allows for a full spectrum of colour reproduction.  Although these printers produce very high quality output, they are usually quite slow and expensive to operate due to the high cost of ink.
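To get a feel for how a screen colour might be split across the four inks, here is the simple textbook conversion from red, green and blue values to cyan, magenta, yellow and black, sketched in Python.  It is only an approximation for illustration; actual printer drivers use far more elaborate colour profiles, and the function name is my own.

```python
def rgb_to_cmyk(red, green, blue):
    """Convert 0-255 RGB values to CMYK fractions between 0.0 and 1.0."""
    r, g, b = red / 255, green / 255, blue / 255
    k = 1 - max(r, g, b)          # black covers the darkness common to all
    if k == 1:                    # pure black: no coloured ink is needed
        return 0.0, 0.0, 0.0, 1.0
    c = (1 - r - k) / (1 - k)
    m = (1 - g - k) / (1 - k)
    y = (1 - b - k) / (1 - k)
    return c, m, y, k


print(rgb_to_cmyk(255, 0, 0))  # (0.0, 1.0, 1.0, 0.0)
```

Pure red comes out as full magenta plus full yellow with no cyan or black, which matches the idea of the eye combining neighbouring dots into a single colour.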

For better speed and lower cost of operation without too much sacrifice in image quality there are laser printers.  Laser printers are considered page printers because the entire page of text or graphics is produced within the machine, then transferred to the paper.  There are a number of laser technologies currently in use, but the most common uses a statically charged rotating drum.  As the drum rotates a laser beam is fired at the surface.  A spinning mirror directs the laser from side to side on the rotating drum as it is pulsed on and off.  Where the laser hits the surface of the drum the static charge is removed.  Farther along the rotation the surface of the drum is passed by a brush of charged ink particles called toner.  The toner is attracted to the drum surface where there is no charge, laying down an image.  Paper is then fed between the drum and another charge which attracts the toner to the paper.  The paper is then fed between a hot roller and a pressure roller to fuse the image to the page.  Colour images are made by rotating the drum four times past four different toner brushes, cyan, magenta, yellow and black, before feeding the paper past to receive the image.  Laser printers are usually more expensive to purchase than inkjet printers, but cost less to operate over time.
